crux.models package¶
Submodules¶
crux.models.dataset module¶
Module contains Dataset model.
-
class
crux.models.dataset.Dataset(id=None, owner_identity_id=None, contact_identity_id=None, name=None, description=None, website=None, created_at=None, modified_at=None, connection=None, raw_response=None, tags=None)¶ Bases:
crux.models.model.CruxModel
Dataset Model.
-
add_label(label_key, label_value)¶ Adds label to Dataset.
Parameters: - label_key (str) – Label Key for Dataset.
- label_value (str) – Label Value for Dataset.
Returns: True if labels are added.
Return type: bool
-
add_permission(identity_id, permission)¶ Adds permission to the Dataset.
Parameters: - identity_id (str) – Identity Id to be set.
- permission (str) – Permission to be set.
Returns: Permission Object.
Return type: crux.models.Permission
-
add_permission_to_resources(identity_id, permission, resource_paths=None, resource_objects=None, resource_ids=None)¶ Adds permission to all or specific Dataset resources.
Parameters: - identity_id (str) – Identity Id to be set.
- permission (str) – Permission to be set.
- resource_paths (list of str) – List of resource paths on which the permission should be applied. If none of resource_paths, resource_objects or resource_ids is set, the permission is applied to the whole dataset.
- resource_objects (list of crux.models.Resource) – List of resource objects on which the permission should be applied. Overrides resource_paths. If none of resource_paths, resource_objects or resource_ids is set, the permission is applied to the whole dataset.
- resource_ids (list of str) – List of resource ids on which the permission should be applied. Overrides resource_paths and resource_objects. If none of resource_paths, resource_objects or resource_ids is set, the permission is applied to the whole dataset.
Returns: True if permission is applied.
Return type: bool
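The precedence described above (resource_ids over resource_objects over resource_paths, falling back to the whole dataset) can be sketched as a small standalone function. This only illustrates the documented rule: select_targets is a hypothetical helper, not part of the crux library, and both the returned tuple shape and the treatment of an empty list as "unset" are assumptions made for the example.

```python
def select_targets(resource_paths=None, resource_objects=None, resource_ids=None):
    """Return (kind, values) for the argument that would take effect.

    Hypothetical helper: mirrors the documented precedence only.
    Assumption: an empty list counts as "not set".
    """
    if resource_ids:
        # resource_ids overrides both resource_objects and resource_paths.
        return ("ids", list(resource_ids))
    if resource_objects:
        # resource_objects overrides resource_paths.
        return ("objects", list(resource_objects))
    if resource_paths:
        return ("paths", list(resource_paths))
    # Nothing set: the permission applies to the whole dataset.
    return ("dataset", [])
```

Passing only the argument you intend to use avoids relying on the override order.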
-
contact_identity_id¶ Gets the Contact Identity ID.
Type: str
-
create_file(path, tags=None, description=None)¶ Creates File resource in Dataset.
Parameters: - path (str) – Path of the file resource.
- tags (list of str) – Tags of the file resource. Defaults to None.
- description (str) – Description of the file resource. Defaults to None.
Returns: File Object.
Return type: crux.models.File
-
create_folder(path, folder='/', tags=None, description=None)¶ Creates Folder resource in Dataset.
Parameters: - path (str) – Path of the Folder resource.
- folder (str) – Parent folder of the Folder resource. Defaults to /.
- tags (list of str) – Tags of the Folder resource. Defaults to None.
- description (str) – Description of the Folder resource. Defaults to None.
Returns: Folder Object.
Return type: crux.models.Folder
-
create_query(path, config, tags=None, description=None)¶ Creates Query resource in Dataset.
Parameters: - path (str) – Query resource Path.
- config (dict) – Query configuration.
- tags (list of str) – Tags of the Query resource. Defaults to None.
- description (str) – Description of the Query resource. Defaults to None.
Returns: Query Object.
Return type: crux.models.Query
-
create_table(path, config, tags=None, description=None)¶ Creates Table resource in Dataset.
Parameters: - path (str) – Table resource Path.
- config (dict) – Table Schema Configuration.
- tags (list of str) – Tags of the Table resource. Defaults to None.
- description (str) – Description of the Table resource. Defaults to None.
Returns: Table Object
Return type: crux.models.Table
-
created_at¶ Gets the Dataset created_at.
Type: str
-
delete()¶ Deletes the dataset.
Returns: True if dataset is deleted. Return type: bool
-
delete_label(label_key)¶ Deletes label from Dataset.
Parameters: label_key (str) – Label Key for Dataset. Returns: True if labels are deleted. Return type: bool
-
delete_permission(identity_id, permission)¶ Deletes permission from the Dataset.
Parameters: - identity_id (str) – Identity Id for the deletion.
- permission (str) – Permission for the deletion.
Returns: True if it is able to delete it.
Return type: bool
-
delete_permission_from_resources(identity_id, permission, resource_paths=None, resource_objects=None, resource_ids=None)¶ Method which deletes permission from all or specific Dataset resources.
Parameters: - identity_id (str) – Identity Id for the deletion.
- permission (str) – Permission for the deletion.
- resource_paths (list of str) – List of resource paths from which the permission should be deleted. If none of resource_paths, resource_objects or resource_ids is set, the permission is deleted from the whole dataset.
- resource_objects (list of crux.models.Resource) – List of resource objects from which the permission should be deleted. Overrides resource_paths. If none of resource_paths, resource_objects or resource_ids is set, the permission is deleted from the whole dataset.
- resource_ids (list of str) – List of resource ids from which the permission should be deleted. Overrides resource_paths and resource_objects. If none of resource_paths, resource_objects or resource_ids is set, the permission is deleted from the whole dataset.
Returns: True if it is able to delete the permission.
Return type: bool
-
description¶ Gets the Dataset Description.
Type: str
-
download_files(folder, local_path)¶ Downloads the resources recursively.
Parameters: - folder (str) – Crux Dataset Folder from where the file resources should be recursively downloaded.
- local_path (str) – Local OS Path where the file resources should be downloaded.
Returns: List of locations of the downloaded files.
Return type: list (str)
Raises: - ValueError – If folder or local_path is None.
- OSError – If local_path is an invalid directory location.
-
find_resources_by_label(predicates, max_per_page=1000)¶ Searches the Dataset for resources matching the given labels.
Each predicate can be either:
- Lexicographical equal
- Lexicographical not equal
- Lexicographical less than
- Lexicographical less than or equal to
- Lexicographical greater than
- Lexicographical greater than or equal to
- A list of OR predicates
- A list of AND predicates
predicates = [
    {"op": "eq", "key": "key1", "val": "abcd"},
    {"op": "ne", "key": "key1", "val": "zzzz"},
    {"op": "lt", "key": "key1", "val": "abd"},
    {"op": "gt", "key": "key1", "val": "abc"},
    {"op": "lte", "key": "key1", "val": "abd"},
    {"op": "gte", "key": "key1", "val": "abc"},
    {"op": "or", "in": [
        {"op": "eq", "key": "key1", "val": "abcd"},
        # more OR predicates...
    ]},
    {"op": "and", "in": [
        {"op": "eq", "key": "key1", "val": "abcd"},
        # more AND predicates...
    ]},
]
Parameters: - predicates (list of dict) – List of dictionary predicates for finding resources.
- max_per_page (int) – Pagination limit. Defaults to 1000.
Returns: List of resources matching the query parameters.
Return type: list (crux.models.Resource)
Example
from crux import Crux

conn = Crux()
dataset_object = conn.get_dataset(id="dataset_id")
predicates = [
    {"op": "eq", "key": "test_label1", "val": "test_value1"}
]
resource_objects = dataset_object.find_resources_by_label(
    predicates=predicates
)
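To make the predicate semantics concrete, the sketch below evaluates predicates locally against a plain dictionary of labels using Python string comparison, which is lexicographical. It is an illustration only: the real matching is performed by the service, matches is a hypothetical helper, and the rule that a resource lacking the label never matches is an assumption.

```python
def matches(labels, predicate):
    """Evaluate one predicate dict against a {key: value} label mapping."""
    op = predicate["op"]
    if op == "or":
        return any(matches(labels, p) for p in predicate["in"])
    if op == "and":
        return all(matches(labels, p) for p in predicate["in"])

    value = labels.get(predicate["key"])
    if value is None:
        # Assumption: a resource without the label never matches.
        return False

    target = predicate["val"]
    if op == "eq":
        return value == target
    if op == "ne":
        return value != target
    if op == "lt":
        return value < target
    if op == "lte":
        return value <= target
    if op == "gt":
        return value > target
    if op == "gte":
        return value >= target
    raise ValueError("unknown op: %r" % op)


labels = {"key1": "abcd"}
is_match = matches(labels, {"op": "and", "in": [
    {"op": "gt", "key": "key1", "val": "abc"},
    {"op": "lt", "key": "key1", "val": "abd"},
]})
```

Because comparison is on strings, "abcd" sorts after "abc" and before "abd", so the combined predicate above holds for this label.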
-
classmethod
from_dict(a_dict)¶ Transforms Dataset Dictionary to Dataset object.
Parameters: a_dict (dict) – Dataset Dictionary. Returns: Dataset Object. Return type: crux.models.Dataset
-
get_file(path)¶ Gets the File resource object.
Parameters: path (str) – File resource path. Returns: File Object. Return type: crux.models.File
-
get_folder(path)¶ Gets the Folder resource object.
Parameters: path (str) – Folder resource path. Returns: Folder Object. Return type: crux.models.Folder
-
get_label(label_key)¶ Gets label value of Dataset.
Parameters: label_key (str) – Label Key for Dataset. Returns: Label Object. Return type: crux.models.Label
-
get_query(path)¶ Gets the Query resource object.
Parameters: path (str) – Query resource path. Returns: Query Object. Return type: crux.models.Query
-
get_stitch_job(job_id)¶ Stitch Job Details.
Parameters: job_id (str) – Job ID of the Stitch Job. Returns: StitchJob object. Return type: crux.models.StitchJob
-
get_table(path)¶ Gets the Table resource object.
Parameters: path (str) – Table resource path. Returns: Table Object. Return type: crux.models.Table
-
id¶ Gets the Dataset ID.
Type: str
-
list_files(sort=None, folder='/', offset=0, limit=100)¶ Lists the files.
Parameters: - sort (str) – Sets whether to sort or not. Defaults to None.
- folder (str) – Folder for which resource should be listed. Defaults to /.
- offset (int) – Sets the offset. Defaults to 0.
- limit (int) – Sets the limit. Defaults to 100.
Returns: List of File objects.
Return type: list (crux.models.File)
-
list_permissions()¶ Lists the permission on the Dataset.
Returns: List of Permission Objects. Return type: list ( crux.models.Permission)
-
list_resources(folder='/', offset=0, limit=1, include_folders=False, sort=None)¶ Lists the resources in Dataset.
Parameters: - folder (str) – Folder for which resource should be listed. Defaults to /.
- offset (int) – Sets the offset. Defaults to 0.
- limit (int) – Sets the limit. Defaults to 1.
- include_folders (bool) – Sets whether to include folders or not. Defaults to False.
- sort (str) – Sets whether to sort or not. Defaults to None.
Returns: List of resource objects.
Return type: list (crux.models.Resource)
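Both list_files and list_resources take offset and limit, so a full listing has to be collected page by page. The sketch below shows one way to do that with a generic generator. It assumes that offset counts items (not pages) and that a page shorter than limit marks the end of the listing; neither is stated in this reference. iter_all and fake_list_files are hypothetical names, and the stand-in function slices a local list so the example runs without a connection.

```python
def iter_all(list_page, limit=100):
    """Yield every item by calling list_page(offset=..., limit=...) repeatedly."""
    offset = 0
    while True:
        page = list_page(offset=offset, limit=limit)
        for item in page:
            yield item
        if len(page) < limit:
            # Assumption: a short (or empty) page means nothing is left.
            break
        offset += limit


def fake_list_files(offset=0, limit=100):
    """Stand-in for a call such as dataset.list_files; slices a local list."""
    names = ["file_%d.csv" % i for i in range(5)]
    return names[offset:offset + limit]


all_names = list(iter_all(fake_list_files, limit=2))
```

With a real Dataset object, a call such as iter_all(lambda offset, limit: dataset.list_files(folder="/", offset=offset, limit=limit)) would follow the same pattern.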
-
load_table_from_file(source_file, dest_table, append=False)¶ Loads table from file resource.
Parameters: - source_file (str or file) – Source File Path in string or File Object.
- dest_table (str or crux.models.Table) – Destination Table Path in string or Table Object.
- append (bool) – Sets whether to append to existing table. Defaults to False.
Returns: LoadJob Object.
Return type: crux.models.LoadJob
Raises: TypeError – If source_file or dest_table is not a file or string object.
-
modified_at¶ Gets the Dataset modified_at.
Type: str
-
name¶ Gets the Dataset Name.
Type: str
-
owner_identity_id¶ Gets the Owner Identity ID.
Type: str
-
provenance¶ Compute or Get the provenance.
Type: str
-
stitch(source_resources, destination_resource, labels=None, tags=None, description=None)¶ Method which stitches multiple Avro resources into single Avro resource
Parameters: - source_resources (list of str or file) – List of resource paths which are to be stitched.
- destination_resource (str) – Resource Path to load the stitched output.
- labels (dict) – Key/Value labels that should be applied to the stitched resource.
- tags (list of str) – List of tags to be applied on the destination resource. Taken into consideration if the resource is required to be created.
- description (str) – Description to be applied to the created destination resource. Taken into consideration if the resource is required to be created.
Returns: File object of the destination resource and the Job ID of the background running job.
Return type: tuple (crux.models.File, str)
-
tags¶ Gets the tags.
Type: str
-
to_dict()¶ Transforms Dataset object to Dataset Dictionary.
Returns: Dataset Dictionary. Return type: dict
-
update(name=None, description=None, tags=None)¶ Updates the metadata of dataset.
Parameters: - name (str) – Name of the dataset. Defaults to None.
- description (str) – Description of the dataset. Defaults to None.
- tags (list of str) – List of tags. Defaults to None.
Returns: True, if dataset is updated.
Return type: bool
Raises: - ValueError – If name, description or tags are unset.
- TypeError – If tags is not of type list.
-
upload_file(src, dest, media_type=None, description=None, tags=None)¶ Uploads the File.
Parameters: - src (str or file) – Local OS path whose content is to be uploaded to file resource.
- dest (str) – File resource path.
- media_type (str) – Content type of the file. Defaults to None.
- description (str) – Description of the file. Defaults to None.
- tags (list of str) – Tags to be attached to the file resource.
Returns: File Object.
Return type: crux.models.File
-
upload_files(local_path, folder, media_type=None, description=None, tags=None)¶ Uploads the resources recursively.
Parameters: - local_path (str) – Local OS Path from where the file resources should be uploaded.
- media_type (str) – Content Types of File resources to be uploaded. Defaults to None.
- folder (str) – Crux Dataset Folder where file resources should be recursively uploaded.
- description (str) – Description to be set on uploaded resources. Defaults to None.
- tags (list of str) – Tags to be set on uploaded resources. Defaults to None.
Returns: List of uploaded file objects.
Return type: list (crux.models.File)
Raises: - ValueError – If folder or local_path is None.
- OSError – If local_path is an invalid directory location.
-
upload_query(sql_file, path, description=None, tags=None)¶ Uploads the Query File.
Parameters: - path (str) – Query resource path.
- sql_file (str) – Local OS SQL file to be uploaded as query resource.
- description (str) – Description for the Query resource. Defaults to None.
- tags (list of str) – Tags for the Query resource. Defaults to None.
Returns: Query Object.
Return type: crux.models.Query
-
website¶ Gets the Dataset Website.
Type: str
-
crux.models.file module¶
Module contains File model.
-
class
crux.models.file.File(id=None, dataset_id=None, folder_id=None, folder=None, name=None, size=None, type=None, config=None, provenance=None, as_of=None, created_at=None, modified_at=None, storage_id=None, description=None, media_type=None, tags=None, labels=None, connection=None, raw_response=None)¶ Bases:
crux.models.resource.Resource
File Model.
-
download(dest, chunk_size=10485760)¶ Downloads the file resource.
Parameters: - dest (str or file) – Local OS path at which file resource will be downloaded.
- chunk_size (int) – Number of bytes to be read in memory.
Returns: True if it is downloaded.
Return type: bool
Raises: TypeError – If dest is not a file-like object or a string.
-
iter_content(chunk_size=10485760)¶ Streams the file resource.
Parameters: chunk_size (int) – Chunk Size for the stream.
Yields: bytes – Bytes of file resource.
Raises: ValueError – If chunk_size is not a multiple of 256 KiB.
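Since chunk_size must be a multiple of 256 KiB, it can help to check or round a requested size before calling iter_content. The helpers below are a minimal sketch and are not part of the crux library; the names are hypothetical, and requiring a positive size is an assumption. The default of 10485760 bytes is 40 units of 256 KiB.

```python
CHUNK_UNIT = 256 * 1024  # 256 KiB in bytes


def is_valid_chunk_size(chunk_size):
    """Return True if chunk_size is a positive multiple of 256 KiB."""
    return chunk_size > 0 and chunk_size % CHUNK_UNIT == 0


def round_up_chunk_size(requested):
    """Round a requested byte count up to the next multiple of 256 KiB."""
    if requested <= 0:
        raise ValueError("requested size must be positive")
    units = -(-requested // CHUNK_UNIT)  # ceiling division
    return units * CHUNK_UNIT


default_is_valid = is_valid_chunk_size(10485760)
one_mb_rounded = round_up_chunk_size(1000000)
```

Rounding up rather than down keeps the chunk at least as large as the size that was asked for.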
-
to_dict()¶ Transforms File object to File Dictionary.
Returns: File Dictionary. Return type: dict
-
upload(src, media_type=None)¶ Uploads the content to empty file resource.
Parameters: - src (str or file) – Local OS path whose content is to be uploaded.
- media_type (str) – Content type of the file. Defaults to None.
Returns: File model object.
Return type: crux.models.File
Raises: TypeError – If src type is invalid.
-
crux.models.folder module¶
Module contains Folder model.
-
class
crux.models.folder.Folder(id=None, dataset_id=None, folder_id=None, folder=None, name=None, size=None, type=None, config=None, provenance=None, as_of=None, created_at=None, modified_at=None, storage_id=None, description=None, media_type=None, tags=None, labels=None, connection=None, raw_response=None)¶ Bases:
crux.models.resource.Resource
Folder Model.
-
add_permission(identity_id, permission, recursive=False)¶ Adds permission to the Folder resource.
Parameters: - identity_id (str) – Identity Id to be set.
- permission (str) – Permission to be set.
- recursive (bool) – If recursive is set to True, it will recursively apply the permission to all resources under the folder resource. Defaults to False.
Returns: - If recursive is set, it returns True.
- If recursive is unset, it returns the Permission object.
Return type: bool or crux.models.Permission
-
delete_permission(identity_id, permission, recursive=False)¶ Deletes permission from Folder resource.
Parameters: - identity_id (str) – Identity Id for the deletion.
- permission (str) – Permission for deletion.
- recursive (bool) – If recursive is set to True, it will recursively delete permission from all resources under the folder resource. Defaults to False.
Returns: True if it is able to delete it.
Return type: bool
-
to_dict()¶ Transforms Folder object to Folder Dictionary.
Returns: Folder Dictionary. Return type: dict
-
crux.models.identity module¶
Module contains Identity model.
-
class
crux.models.identity.Identity(identity_id=None, parent_identity_id=None, description=None, company_name=None, first_name=None, last_name=None, role=None, phone=None, email=None, type=None, website=None, landing_page=None, connection=None, raw_response=None)¶ Bases:
crux.models.model.CruxModel
Identity Model.
-
company_name¶ Gets the Company name.
Type: str
-
description¶ Gets the Description.
Type: str
-
email¶ Gets the Email.
Type: str
-
first_name¶ Gets the First name.
Type: str
-
classmethod
from_dict(a_dict)¶ Transforms Identity Dictionary to Identity object.
Parameters: a_dict (dict) – Identity Dictionary. Returns: Identity Object. Return type: crux.models.Identity
-
identity_id¶ Gets the Identity Id.
Type: str
-
landing_page¶ Gets the Landing Page.
Type: str
-
last_name¶ Gets the Last name.
Type: str
-
parent_identity_id¶ Gets the Parent Identity Id.
Type: str
-
phone¶ Gets the phone.
Type: str
-
role¶ Gets the Role.
Type: str
-
to_dict()¶ Transforms Identity object to Identity Dictionary.
Returns: Identity Dictionary. Return type: dict
-
type¶ Gets the Type.
Type: str
-
website¶ Gets the Website.
Type: str
-
crux.models.job module¶
Module contains AbstractJob, Job, LoadJob Model.
-
class
crux.models.job.AbstractJob¶ Bases:
crux.models.model.CruxModel
AbstractJob Model.
-
class
crux.models.job.Job(job_id=None, status=None, statistics=None, connection=None)¶ Bases:
crux.models.job.AbstractJob
Job Model.
-
classmethod
from_dict(a_dict)¶ Transforms Job Dictionary to Job object.
Parameters: a_dict (dict) – Job Dictionary. Returns: Job Object. Return type: crux.models.Job
-
class
crux.models.job.Load(input_files=None, input_file_bytes=None, output_rows=None, output_bytes=None, bad_records=None)¶ Bases:
object
Job Load Model.
-
classmethod
from_dict(a_dict)¶ Transforms Job Load Dictionary to Job Load object.
Parameters: a_dict (dict) – Job Load Dictionary. Returns: Job Load Object. Return type: crux.models.job.Load
-
class
crux.models.job.LoadJob(job_id=None, job_url=None)¶ Bases:
crux.models.job.AbstractJob
LoadJob Model.
-
classmethod
from_dict(a_dict)¶ Transforms LoadJob Dictionary to LoadJob object.
Parameters: a_dict (dict) – LoadJob Dictionary. Returns: LoadJob Object. Return type: crux.models.LoadJob
-
job_id¶ Gets the Job Id.
Type: str
-
job_url¶ Gets the Job URL.
Type: str
-
class
crux.models.job.Statistics(creation_time=None, start_time=None, end_time=None, load=None)¶ Bases:
object
Job Statistic Model.
-
classmethod
from_dict(a_dict)¶ Transforms Job Statistics Dictionary to Job Statistics object.
Parameters: a_dict (dict) – Job Statistics Dictionary. Returns: Job Statistics Object. Return type: crux.models.job.Statistics
-
class
crux.models.job.Status(state=None)¶ Bases:
object
Job Status Model.
-
classmethod
from_dict(a_dict)¶ Transforms Job Status Dictionary to Job Status object.
Parameters: a_dict (dict) – Job Status Dictionary. Returns: Job Status Object. Return type: crux.models.job.Status
-
class
crux.models.job.StitchJob(job_id=None, status=None)¶ Bases:
crux.models.job.AbstractJob
Stitch Job Model.
-
classmethod
from_dict(a_dict)¶ Transforms Stitch Job Dictionary to Stitch Job object.
Parameters: a_dict (dict) – Stitch Job Dictionary. Returns: Stitch Job Object. Return type: crux.models.job.StitchJob
-
crux.models.label module¶
Module contains Label model.
-
class
crux.models.label.Label(label_key=None, label_value=None)¶ Bases:
crux.models.model.CruxModel
Label Model.
-
classmethod
from_dict(a_dict)¶ Transforms Label Dictionary to Label object.
Parameters: a_dict (dict) – Label Dictionary. Returns: Label Object. Return type: crux.models.Label
-
to_dict()¶ Transforms Label object to Label Dictionary.
Returns: Label Dictionary. Return type: dict
-
crux.models.model module¶
Module defines abstract CruxModel.
crux.models.permission module¶
Module contains Permission model.
-
class
crux.models.permission.Permission(target_id=None, identity_id=None, permission_name=None)¶ Bases:
crux.models.model.CruxModel
Permission Model.
-
classmethod
from_dict(a_dict)¶ Transforms Permission Dictionary to Permission object.
Parameters: a_dict (dict) – Permission Dictionary. Returns: Permission Object. Return type: crux.models.Permission
-
identity_id¶ Gets the Identity ID.
Type: str
-
permission_name¶ Gets the Permission Name.
Type: str
-
target_id¶ Gets the Target ID.
Type: str
-
to_dict()¶ Transforms Permission object to Permission Dictionary.
Returns: Permission Dictionary. Return type: dict
-
crux.models.query module¶
Module contains Query model.
-
class
crux.models.query.Query(id=None, dataset_id=None, folder_id=None, folder=None, name=None, size=None, type=None, config=None, provenance=None, as_of=None, created_at=None, modified_at=None, storage_id=None, description=None, media_type=None, tags=None, labels=None, connection=None, raw_response=None)¶ Bases:
crux.models.resource.Resource
Query Model.
-
download(dest, format='csv', params=None)¶ Downloads the Query output.
Parameters: - dest (str) – Local OS path at which resource will be downloaded.
- format (str) – Output format of the query. Defaults to csv.
- params (dict) – Run parameters. Defaults to None.
Returns: True if it is downloaded.
Return type: bool
-
run(format='csv', params=None, chunk_size=10485760, decode_unicode=False)¶ Method which streams the Query
Parameters: - format (str) – Output format of the query. Defaults to csv.
- params (dict) – Run parameters. Defaults to None.
- chunk_size (int) – Chunk Size for the stream
- decode_unicode (bool) – If decode_unicode is True, content will be decoded using the best available encoding based on the response. Defaults to False.
Yields: bytes – Bytes of content.
Raises: ValueError – If chunk_size is not a multiple of 256 KiB.
-
to_dict()¶ Transforms Query object to Query dictionary.
Returns: Query dictionary. Return type: dict
-
crux.models.resource module¶
Module contains Resource model.
-
class
crux.models.resource.MediaType¶ Bases:
enum.Enum
MediaType Enumeration Model.
-
AVRO= 'avro/binary'¶
-
CSV= 'text/csv'¶
-
JSON= 'application/json'¶
-
NDJSON= 'application/x-ndjson'¶
-
PARQUET= 'application/parquet'¶
-
detect= <bound method MediaType.detect of <enum 'MediaType'>>¶
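The signature and behaviour of MediaType.detect are not described here, so the sketch below does not use it. It defines a local stand-in enum with the same member values as listed above and a hypothetical guess_media_type function that maps a file extension to one of those values. The extension table is an assumption made for illustration and may differ from what the library actually does.

```python
import os
from enum import Enum


class MediaTypeSketch(Enum):
    """Local stand-in that copies the documented member values."""
    AVRO = "avro/binary"
    CSV = "text/csv"
    JSON = "application/json"
    NDJSON = "application/x-ndjson"
    PARQUET = "application/parquet"


# Assumed mapping from file extension to media type member.
_EXTENSIONS = {
    ".avro": MediaTypeSketch.AVRO,
    ".csv": MediaTypeSketch.CSV,
    ".json": MediaTypeSketch.JSON,
    ".ndjson": MediaTypeSketch.NDJSON,
    ".parquet": MediaTypeSketch.PARQUET,
}


def guess_media_type(path):
    """Return the media type string for a path, or None if unknown."""
    extension = os.path.splitext(path)[1].lower()
    member = _EXTENSIONS.get(extension)
    return member.value if member else None


csv_type = guess_media_type("data/prices.CSV")
```

A string produced this way could be passed wherever this reference takes a media_type (str) argument, for example in upload_file.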
-
-
class
crux.models.resource.Resource(id=None, dataset_id=None, folder_id=None, folder=None, name=None, size=None, type=None, config=None, provenance=None, as_of=None, created_at=None, modified_at=None, storage_id=None, description=None, media_type=None, tags=None, labels=None, connection=None, raw_response=None)¶ Bases:
crux.models.model.CruxModel
Resource Model.
-
add_label(label_key, label_value)¶ Adds label to Resource.
Parameters: - label_key (str) – Label Key for Resource.
- label_value (str) – Label Value for Resource.
Returns: True if label is added, False otherwise.
Return type: bool
-
add_permission(identity_id, permission)¶ Adds permission to the resource.
Parameters: - identity_id (str) – Identity Id to be set.
- permission (str) – Permission to be set.
Returns: Permission Object.
Return type: crux.models.Permission
-
as_of¶ Gets the as_of.
Type: str
-
config¶ Gets the config.
Type: str
-
created_at¶ Gets created_at.
Type: str
-
dataset_id¶ Gets the Dataset ID.
Type: str
-
delete()¶ Deletes Resource from Dataset.
Returns: True if it is deleted. Return type: bool
-
delete_label(label_key)¶ Deletes label from Resource.
Parameters: label_key (str) – Label Key for Resource. Returns: True if label is deleted, False otherwise. Return type: bool
-
delete_permission(identity_id, permission)¶ Deletes permission from the resource.
Parameters: - identity_id (str) – Identity Id for the deletion.
- permission (str) – Permission for the deletion.
Returns: True if it is able to delete it.
Return type: bool
-
description¶ Gets the Resource Description.
Type: str
-
folder¶ Compute or Get the folder name.
Type: str
-
folder_id¶ Gets the Folder ID.
Type: str
-
classmethod
from_dict(a_dict)¶ Transforms Resource Dictionary to Resource object.
Parameters: a_dict (dict) – Resource Dictionary. Returns: Resource Object. Return type: crux.models.Resource
-
id¶ Gets the Resource ID.
Type: str
-
labels¶ Gets the Resource labels.
Type: dict
-
list_permissions()¶ Lists the permission on the resource.
Returns: List of Permission Objects. Return type: list ( crux.models.Permission)
-
media_type¶ Gets the Resource media type.
Type: str
-
modified_at¶ Gets modified_at.
Type: str
-
name¶ Gets the Resource Name.
Type: str
-
path¶ Compute or Get the resource path.
Type: str
-
provenance¶ Gets the Provenance.
Type: str
-
refresh()¶ Refresh Resource model from API backend.
Returns: True if it is able to refresh the model, False otherwise.
Return type: bool
-
size¶ Gets the size.
Type: int
-
storage_id¶ Gets the Storage ID.
Type: str
-
tags¶ Gets the Resource Tags.
Type: list of str
-
to_dict()¶ Transforms Resource object to Resource Dictionary.
Returns: Resource Dictionary. Return type: dict
-
type¶ Gets the Resource Type.
Type: str
-
update(name=None, description=None, tags=None)¶ Updates the metadata for Resource.
Parameters: - name (str) – Name of resource. Defaults to None.
- description (str) – Description of the resource. Defaults to None.
- tags (list of str) – List of tags. Defaults to None.
Returns: True, if resource is updated.
Return type: bool
Raises: - ValueError – If name, description or tags are unset.
- TypeError – If tags is not of type list.
-
crux.models.table module¶
Module contains Table model.
-
class
crux.models.table.Table(id=None, dataset_id=None, folder_id=None, folder=None, name=None, size=None, type=None, config=None, provenance=None, as_of=None, created_at=None, modified_at=None, storage_id=None, description=None, media_type=None, tags=None, labels=None, connection=None, raw_response=None)¶ Bases:
crux.models.resource.Resource
Table Model.
-
download(dest, media_type, chunk_size=10485760)¶ Downloads the table resource.
Parameters: - dest (str or file) – Local OS path at which file resource will be downloaded.
- media_type (str) – Content Type for download.
- chunk_size (int) – Number of bytes to be read in memory.
Returns: True if it is downloaded.
Return type: bool
Raises: TypeError – If dest is not a file-like object or a string.
-
to_dict()¶ Transforms Table object to Table Dictionary.
Returns: Table Dictionary. Return type: dict
-
Module contents¶
Module containing models that represent objects returned by the API.
-
class
crux.models.Identity(identity_id=None, parent_identity_id=None, description=None, company_name=None, first_name=None, last_name=None, role=None, phone=None, email=None, type=None, website=None, landing_page=None, connection=None, raw_response=None)¶ Bases:
crux.models.model.CruxModel
Identity Model.
-
company_name¶ Gets the Company name.
Type: str
-
description¶ Gets the Description.
Type: str
-
email¶ Gets the Email.
Type: str
-
first_name¶ Gets the First name.
Type: str
-
classmethod
from_dict(a_dict)¶ Transforms Identity Dictionary to Identity object.
Parameters: a_dict (dict) – Identity Dictionary. Returns: Identity Object. Return type: crux.models.Identity
-
identity_id¶ Gets the Identity Id.
Type: str
-
landing_page¶ Gets the Landing Page.
Type: str
-
last_name¶ Gets the Last name.
Type: str
-
parent_identity_id¶ Gets the Parent Identity Id.
Type: str
-
phone¶ Gets the phone.
Type: str
-
role¶ Gets the Role.
Type: str
-
to_dict()¶ Transforms Identity object to Identity Dictionary.
Returns: Identity Dictionary. Return type: dict
-
type¶ Gets the Type.
Type: str
-
website¶ Gets the Website.
Type: str
-
-
class
crux.models.Permission(target_id=None, identity_id=None, permission_name=None)¶ Bases:
crux.models.model.CruxModel
Permission Model.
-
classmethod
from_dict(a_dict)¶ Transforms Permission Dictionary to Permission object.
Parameters: a_dict (dict) – Permission Dictionary. Returns: Permission Object. Return type: crux.models.Permission
-
identity_id¶ Gets the Identity ID.
Type: str
-
permission_name¶ Gets the Permission Name.
Type: str
-
target_id¶ Gets the Target ID.
Type: str
-
to_dict()¶ Transforms Permission object to Permission Dictionary.
Returns: Permission Dictionary. Return type: dict
-
class
crux.models.LoadJob(job_id=None, job_url=None)¶ Bases:
crux.models.job.AbstractJob
LoadJob Model.
-
classmethod
from_dict(a_dict)¶ Transforms LoadJob Dictionary to LoadJob object.
Parameters: a_dict (dict) – LoadJob Dictionary. Returns: LoadJob Object. Return type: crux.models.LoadJob
-
job_id¶ Gets the Job Id.
Type: str
-
job_url¶ Gets the Job URL.
Type: str
-
class
crux.models.StitchJob(job_id=None, status=None)¶ Bases:
crux.models.job.AbstractJob
Stitch Job Model.
-
classmethod
from_dict(a_dict)¶ Transforms Stitch Job Dictionary to Stitch Job object.
Parameters: a_dict (dict) – Stitch Job Dictionary. Returns: Stitch Job Object. Return type: crux.models.job.StitchJob
-
class
crux.models.Job(job_id=None, status=None, statistics=None, connection=None)¶ Bases:
crux.models.job.AbstractJob
Job Model.
-
classmethod
from_dict(a_dict)¶ Transforms Job Dictionary to Job object.
Parameters: a_dict (dict) – Job Dictionary. Returns: Job Object. Return type: crux.models.Job
-
class
crux.models.Resource(id=None, dataset_id=None, folder_id=None, folder=None, name=None, size=None, type=None, config=None, provenance=None, as_of=None, created_at=None, modified_at=None, storage_id=None, description=None, media_type=None, tags=None, labels=None, connection=None, raw_response=None)¶ Bases:
crux.models.model.CruxModel
Resource Model.
-
add_label(label_key, label_value)¶ Adds label to Resource.
Parameters: - label_key (str) – Label Key for Resource.
- label_value (str) – Label Value for Resource.
Returns: True if label is added, False otherwise.
Return type: bool
-
add_permission(identity_id, permission)¶ Adds permission to the resource.
Parameters: - identity_id (str) – Identity Id to be set.
- permission (str) – Permission to be set.
Returns: Permission Object.
Return type: crux.models.Permission
-
as_of¶ Gets the as_of.
Type: str
-
config¶ Gets the config.
Type: str
-
created_at¶ Gets created_at.
Type: str
-
dataset_id¶ Gets the Dataset ID.
Type: str
-
delete()¶ Deletes Resource from Dataset.
Returns: True if it is deleted. Return type: bool
-
delete_label(label_key)¶ Deletes label from Resource.
Parameters: label_key (str) – Label Key for Resource. Returns: True if label is deleted, False otherwise. Return type: bool
-
delete_permission(identity_id, permission)¶ Deletes permission from the resource.
Parameters: - identity_id (str) – Identity Id for the deletion.
- permission (str) – Permission for the deletion.
Returns: True if it is able to delete it.
Return type: bool
-
description¶ Gets the Resource Description.
Type: str
-
folder¶ Compute or Get the folder name.
Type: str
-
folder_id¶ Gets the Folder ID.
Type: str
-
classmethod
from_dict(a_dict)¶ Transforms Resource Dictionary to Resource object.
Parameters: a_dict (dict) – Resource Dictionary. Returns: Resource Object. Return type: crux.models.Resource
-
id¶ Gets the Resource ID.
Type: str
-
labels¶ Gets the Resource labels.
Type: dict
-
list_permissions()¶ Lists the permission on the resource.
Returns: List of Permission Objects. Return type: list ( crux.models.Permission)
-
media_type¶ Gets the Resource media type.
Type: str
-
modified_at¶ Gets modified_at.
Type: str
-
name¶ Gets the Resource Name.
Type: str
-
path¶ Compute or Get the resource path.
Type: str
-
provenance¶ Gets the Provenance.
Type: str
-
refresh()¶ Refresh Resource model from API backend.
Returns: True if it is able to refresh the model, False otherwise.
Return type: bool
-
size¶ Gets the size.
Type: int
-
storage_id¶ Gets the Storage ID.
Type: str
-
tags¶ Gets the Resource Tags.
Type: list of str
-
to_dict()¶ Transforms Resource object to Resource Dictionary.
Returns: Resource Dictionary. Return type: dict
-
type¶ Gets the Resource Type.
Type: str
-
update(name=None, description=None, tags=None)¶ Updates the metadata for Resource.
Parameters: - name (str) – Name of resource. Defaults to None.
- description (str) – Description of the resource. Defaults to None.
- tags (list of str) – List of tags. Defaults to None.
Returns: True, if resource is updated.
Return type: bool
Raises: - ValueError – It is raised if name, description or tags are unset.
- TypeError – It is raised if tags is not of type list.
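The error conditions above can be pictured with a small stand-alone sketch. It is not the library's implementation: `build_update_body` is a hypothetical helper, and it assumes that ValueError is meant for the case where none of the three fields is supplied.

```python
def build_update_body(name=None, description=None, tags=None):
    """Collect the metadata fields that were actually supplied.

    Hypothetical helper mirroring the documented checks of update().
    """
    # Assumption: "unset" means that all three arguments were left as None.
    if name is None and description is None and tags is None:
        raise ValueError("name, description or tags should be set")
    # Documented check: tags has to be a list.
    if tags is not None and not isinstance(tags, list):
        raise TypeError("tags should be a list of str")

    body = {}
    if name is not None:
        body["name"] = name
    if description is not None:
        body["description"] = description
    if tags is not None:
        body["tags"] = tags
    return body


body = build_update_body(description="Daily prices", tags=["prices"])
```

Only the supplied fields end up in the resulting dictionary, so a call that changes the description leaves the name untouched.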
-
-
class
crux.models.File(id=None, dataset_id=None, folder_id=None, folder=None, name=None, size=None, type=None, config=None, provenance=None, as_of=None, created_at=None, modified_at=None, storage_id=None, description=None, media_type=None, tags=None, labels=None, connection=None, raw_response=None)¶ Bases:
crux.models.resource.Resource
File Model.
-
download(dest, chunk_size=10485760)¶ Downloads the file resource.
Parameters: - dest (str or file) – Local OS path at which file resource will be downloaded.
- chunk_size (int) – Number of bytes to be read in memory.
Returns: True if it is downloaded.
Return type: bool
Raises: TypeError – If dest is not a file-like object or a string.
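The "str or file" contract of dest can be illustrated without the client itself. The following is a minimal sketch, assuming the content arrives as an iterable of byte chunks; `write_chunks` is a hypothetical helper, not part of crux.models.

```python
import io


def write_chunks(chunks, dest):
    """Write byte chunks to dest, which is a path string or a binary file-like object."""
    if isinstance(dest, str):
        # A string is treated as a local OS path.
        with open(dest, "wb") as handle:
            for chunk in chunks:
                handle.write(chunk)
        return True
    if hasattr(dest, "write"):
        # Anything with a write() method is treated as an open binary file.
        for chunk in chunks:
            dest.write(chunk)
        return True
    raise TypeError("dest must be a path string or a writable file-like object")


buffer = io.BytesIO()
ok = write_chunks([b"col_a,col_b\n", b"1,2\n"], buffer)
```

Accepting an open file object lets the caller decide where the bytes go (a temporary file, an in-memory buffer), while the string form covers the common case of a local path.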
-
iter_content(chunk_size=10485760)¶ Streams the file resource.
Parameters: chunk_size (int) – Chunk Size for the stream.
Yields: bytes – Bytes of file resource.
Raises: ValueError – If chunk_size is not a multiple of 256 KiB.
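The 256 KiB rule is simple arithmetic: the default of 10485760 bytes is exactly 40 × 262144. A short sketch of checking a value, or rounding one up before passing it as chunk_size, could look like this. Both helper names are hypothetical, and the rejection of non-positive sizes is an added assumption.

```python
KIB_256 = 256 * 1024  # 262144 bytes


def check_chunk_size(chunk_size):
    """Return chunk_size unchanged if it is a positive multiple of 256 KiB."""
    # The positivity guard is an assumption; the documented rule is the multiple check.
    if chunk_size <= 0 or chunk_size % KIB_256 != 0:
        raise ValueError("chunk_size should be a multiple of 256 KiB")
    return chunk_size


def round_up_chunk_size(requested):
    """Return the smallest positive multiple of 256 KiB that is >= requested."""
    multiples = max(1, -(-requested // KIB_256))  # ceiling division
    return multiples * KIB_256


default_size = check_chunk_size(10485760)
one_mb_rounded = round_up_chunk_size(1_000_000)
```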
-
to_dict()¶ Transforms File object to File Dictionary.
Returns: File Dictionary. Return type: dict
-
upload(src, media_type=None)¶ Uploads the content to empty file resource.
Parameters: - src (str or file) – Local OS path whose content is to be uploaded.
- media_type (str) – Content type of the file. Defaults to None.
Returns: File model object.
Return type: crux.models.File
Raises: TypeError – If src type is invalid.
-
-
class
crux.models.Folder(id=None, dataset_id=None, folder_id=None, folder=None, name=None, size=None, type=None, config=None, provenance=None, as_of=None, created_at=None, modified_at=None, storage_id=None, description=None, media_type=None, tags=None, labels=None, connection=None, raw_response=None)¶ Bases:
crux.models.resource.Resource
Folder Model.
-
add_permission(identity_id, permission, recursive=False)¶ Adds permission to the Folder resource.
Parameters: - identity_id (str) – Identity Id to be set.
- permission (str) – Permission to be set.
- recursive (bool) – If recursive is set to True, it will recursively apply permission to all resources under the folder resource. Defaults to False.
Returns: True if recursive is set, otherwise the Permission object.
Return type: bool or crux.models.Permission
-
delete_permission(identity_id, permission, recursive=False)¶ Deletes permission from Folder resource.
Parameters: - identity_id (str) – Identity Id for the deletion.
- permission (str) – Permission for deletion.
- recursive (bool) – If recursive is set to True, it will recursively delete permission from all resources under the folder resource. Defaults to False.
Returns: True if it is able to delete it.
Return type: bool
-
to_dict()¶ Transforms Folder object to Folder Dictionary.
Returns: Folder Dictionary. Return type: dict
-
-
class
crux.models.Table(id=None, dataset_id=None, folder_id=None, folder=None, name=None, size=None, type=None, config=None, provenance=None, as_of=None, created_at=None, modified_at=None, storage_id=None, description=None, media_type=None, tags=None, labels=None, connection=None, raw_response=None)¶ Bases:
crux.models.resource.Resource
Table Model.
-
download(dest, media_type, chunk_size=10485760)¶ Downloads the table resource.
Parameters: - dest (str or file) – Local OS path at which file resource will be downloaded.
- media_type (str) – Content Type for download.
- chunk_size (int) – Number of bytes to be read in memory.
Returns: True if it is downloaded.
Return type: bool
Raises: TypeError – If dest is not a file-like object or a string.
-
to_dict()¶ Transforms Table object to Table Dictionary.
Returns: Table Dictionary. Return type: dict
-
-
class
crux.models.Dataset(id=None, owner_identity_id=None, contact_identity_id=None, name=None, description=None, website=None, created_at=None, modified_at=None, connection=None, raw_response=None, tags=None)¶ Bases:
crux.models.model.CruxModel
Dataset Model.
-
add_label(label_key, label_value)¶ Adds label to Dataset.
Parameters: - label_key (str) – Label Key for Dataset.
- label_value (str) – Label Value for Dataset.
Returns: True if labels are added.
Return type: bool
-
add_permission(identity_id, permission)¶ Adds permission to the Dataset.
Parameters: - identity_id (str) – Identity Id to be set.
- permission (str) – Permission to be set.
Returns: Permission Object.
Return type: crux.models.Permission
-
add_permission_to_resources(identity_id, permission, resource_paths=None, resource_objects=None, resource_ids=None)¶ Adds permission to all or specific Dataset resources.
Parameters: - identity_id (str) – Identity Id to be set.
- permission (str) – Permission to be set.
- resource_paths (list of str) – List of resource paths on which the permission should be applied. If none of the resource_paths, resource_objects or resource_ids parameters is set, the permission is applied to the whole dataset.
- resource_objects (list of crux.models.Resource) – List of resource objects on which the permission should be applied. Overrides resource_paths. If none of the resource_paths, resource_objects or resource_ids parameters is set, the permission is applied to the whole dataset.
- resource_ids (list of str) – List of resource ids on which the permission should be applied. Overrides resource_paths and resource_objects. If none of the resource_paths, resource_objects or resource_ids parameters is set, the permission is applied to the whole dataset.
Returns: True if permission is applied.
Return type: bool
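The override rules for the three resource parameters amount to a fixed precedence: resource_ids first, then resource_objects, then resource_paths, and the whole dataset when none is given. The sketch below only illustrates that ordering. `select_targets` is a hypothetical helper, and treating an empty list the same as None is an assumption.

```python
def select_targets(resource_paths=None, resource_objects=None, resource_ids=None):
    """Return (kind, values) following the documented override order."""
    if resource_ids:  # overrides resource_paths and resource_objects
        return "ids", list(resource_ids)
    if resource_objects:  # overrides resource_paths
        return "objects", list(resource_objects)
    if resource_paths:
        return "paths", list(resource_paths)
    # Nothing set: the permission applies to the whole dataset.
    return "dataset", None


kind, values = select_targets(
    resource_paths=["/raw/a.csv"],
    resource_ids=["id-1", "id-2"],
)
```

Here the ids win even though paths were also passed, which is the behaviour the parameter descriptions spell out.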
-
contact_identity_id¶ Gets the Contact Identity ID.
Type: str
-
create_file(path, tags=None, description=None)¶ Creates File resource in Dataset.
Parameters: - path (str) – Path of the file resource.
- tags (list of str) – Tags of the file resource. Defaults to None.
- description (str) – Description of the file resource. Defaults to None.
Returns: File Object.
Return type: crux.models.File
-
create_folder(path, folder='/', tags=None, description=None)¶ Creates Folder resource in Dataset.
Parameters: - path (str) – Path of the Folder resource.
- folder (str) – Parent folder of the Folder resource. Defaults to /.
- tags (list of str) – Tags of the Folder resource. Defaults to None.
- description (str) – Description of the Folder resource. Defaults to None.
Returns: Folder Object.
Return type: crux.models.Folder
-
create_query(path, config, tags=None, description=None)¶ Creates Query resource in Dataset.
Parameters: - path (str) – Query resource Path.
- config (dict) – Query configuration.
- tags (list of str) – Tags of the Query resource. Defaults to None.
- description (str) – Description of the Query resource. Defaults to None.
Returns: Query Object.
Return type: crux.models.Query
-
create_table(path, config, tags=None, description=None)¶ Creates Table resource in Dataset.
Parameters: - path (str) – Table resource Path.
- config (dict) – Table Schema Configuration.
- tags (list of str) – Tags of the Table resource. Defaults to None.
- description (str) – Description of the Table resource. Defaults to None.
Returns: Table Object.
Return type: crux.models.Table
-
created_at¶ Gets the Dataset created_at.
Type: str
-
delete()¶ Deletes the dataset.
Returns: True if dataset is deleted. Return type: bool
-
delete_label(label_key)¶ Deletes label from Dataset.
Parameters: label_key (str) – Label Key for Dataset. Returns: True if labels are deleted. Return type: bool
-
delete_permission(identity_id, permission)¶ Deletes permission from the Dataset.
Parameters: - identity_id (str) – Identity Id for the deletion.
- permission (str) – Permission for the deletion.
Returns: True if it is able to delete it.
Return type: bool
-
delete_permission_from_resources(identity_id, permission, resource_paths=None, resource_objects=None, resource_ids=None)¶ Deletes permission from all or specific Dataset resources.
Parameters: - identity_id (str) – Identity Id for the deletion.
- permission (str) – Permission for the deletion.
- resource_paths (list of str) – List of resource paths from which the permission should be deleted. If none of the resource_paths, resource_objects or resource_ids parameters is set, the permission is deleted from the whole dataset.
- resource_objects (list of crux.models.Resource) – List of resource objects from which the permission should be deleted. Overrides resource_paths. If none of the resource_paths, resource_objects or resource_ids parameters is set, the permission is deleted from the whole dataset.
- resource_ids (list of str) – List of resource ids from which the permission should be deleted. Overrides resource_paths and resource_objects. If none of the resource_paths, resource_objects or resource_ids parameters is set, the permission is deleted from the whole dataset.
Returns: True if it is able to delete the permission.
Return type: bool
-
description¶ Gets the Dataset Description.
Type: str
-
download_files(folder, local_path)¶ Downloads the resources recursively.
Parameters: - folder (str) – Crux Dataset Folder from where the file resources should be recursively downloaded.
- local_path (str) – Local OS Path where the file resources should be downloaded.
Returns: List of locations of downloaded files.
Return type: list (str)
Raises: - ValueError – If folder or local_path is None.
- OSError – If local_path is an invalid directory location.
-
find_resources_by_label(predicates, max_per_page=1000)¶ Searches the resources for given labels in Dataset.
Each predicate can be either:
- Lexicographical equal (eq)
- Lexicographical not equal (ne)
- Lexicographical less than (lt)
- Lexicographical less than or equal to (lte)
- Lexicographical greater than (gt)
- Lexicographical greater than or equal to (gte)
- A list of OR predicates (or)
- A list of AND predicates (and)
predicates = [
    {"op": "eq", "key": "key1", "val": "abcd"},
    {"op": "ne", "key": "key1", "val": "zzzz"},
    {"op": "lt", "key": "key1", "val": "abd"},
    {"op": "gt", "key": "key1", "val": "abc"},
    {"op": "lte", "key": "key1", "val": "abd"},
    {"op": "gte", "key": "key1", "val": "abc"},
    {"op": "or", "in": [
        {"op": "eq", "key": "key1", "val": "abcd"},
        # more OR predicates...
    ]},
    {"op": "and", "in": [
        {"op": "eq", "key": "key1", "val": "abcd"},
        # more AND predicates...
    ]},
]
Parameters: - predicates (list of dict) – List of dictionary predicates for finding resources.
- max_per_page (int) – Pagination limit. Defaults to 1000.
Returns: List of resources matching the query parameters.
Return type: list (crux.models.Resource)
Example
from crux import Crux

conn = Crux()
dataset_object = conn.get_dataset(id="dataset_id")
predicates = [
    {"op": "eq", "key": "test_label1", "val": "test_value1"}
]
resource_objects = dataset_object.find_resources_by_label(
    predicates=predicates
)
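To get a feel for what "lexicographical" means for these operators, a single predicate can be evaluated locally against a plain dictionary of labels. This is a sketch under assumptions, not the server-side logic: `matches` is a hypothetical helper, Python's string ordering stands in for the service's comparison, and a missing key is assumed to never match.

```python
import operator

# Map the documented op names onto Python comparisons.
COMPARATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "gt": operator.gt,
    "lte": operator.le,
    "gte": operator.ge,
}


def matches(predicate, labels):
    """Evaluate one predicate dictionary against a {key: value} label mapping."""
    op = predicate["op"]
    if op == "or":
        return any(matches(inner, labels) for inner in predicate["in"])
    if op == "and":
        return all(matches(inner, labels) for inner in predicate["in"])
    if predicate["key"] not in labels:
        return False  # assumption: a resource without the label never matches
    return COMPARATORS[op](labels[predicate["key"]], predicate["val"])


labels = {"key1": "abcd"}
results = [
    matches({"op": "eq", "key": "key1", "val": "abcd"}, labels),
    matches({"op": "lt", "key": "key1", "val": "abd"}, labels),
    matches({"op": "gt", "key": "key1", "val": "abc"}, labels),
    matches({"op": "gt", "key": "key1", "val": "abd"}, labels),
]
```

With the label value "abcd", the value sorts before "abd" and after "abc", so the first three predicates hold and the last one does not.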
-
classmethod
from_dict(a_dict)¶ Transforms Dataset Dictionary to Dataset object.
Parameters: a_dict (dict) – Dataset Dictionary. Returns: Dataset Object. Return type: crux.models.Dataset
-
get_file(path)¶ Gets the File resource object.
Parameters: path (str) – File resource path. Returns: File Object. Return type: crux.models.File
-
get_folder(path)¶ Gets the Folder resource object.
Parameters: path (str) – Folder resource path. Returns: Folder Object. Return type: crux.models.Folder
-
get_label(label_key)¶ Gets label value of Dataset.
Parameters: label_key (str) – Label Key for Dataset. Returns: Label Object. Return type: crux.models.Label
-
get_query(path)¶ Gets the Query resource object.
Parameters: path (str) – Query resource path. Returns: Query Object. Return type: crux.models.Query
-
get_stitch_job(job_id)¶ Gets the Stitch Job details.
Parameters: job_id (str) – Job ID of the Stitch Job. Returns: StitchJob object. Return type: crux.models.StitchJob
-
get_table(path)¶ Gets the Table resource object.
Parameters: path (str) – Table resource path. Returns: Table Object. Return type: crux.models.Table
-
id¶ Gets the Dataset ID.
Type: str
-
list_files(sort=None, folder='/', offset=0, limit=100)¶ Lists the files.
Parameters: - sort (str) – Sets whether to sort or not. Defaults to None.
- folder (str) – Folder for which resource should be listed. Defaults to /.
- offset (int) – Sets the offset. Defaults to 0.
- limit (int) – Sets the limit. Defaults to 100.
Returns: List of File objects.
Return type: list (crux.models.File)
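Because offset and limit page through the listing, collecting every entry takes a loop. The sketch below is one way to write it, assuming that offset counts items and that a page shorter than limit marks the end; neither is stated in this reference. `iter_all` is a hypothetical helper, and `fake_list_files` stands in for a bound method such as a dataset object's list_files.

```python
def iter_all(fetch_page, limit=100):
    """Yield every item from a callable that accepts offset= and limit=."""
    if limit <= 0:
        raise ValueError("limit should be a positive integer")
    offset = 0
    while True:
        page = fetch_page(offset=offset, limit=limit)
        yield from page
        if len(page) < limit:
            break  # assumption: a short (or empty) page is the last one
        offset += limit


# Stand-in for a listing call so the sketch runs without a connection.
ALL_NAMES = ["file_%03d.csv" % number for number in range(250)]


def fake_list_files(offset=0, limit=100):
    return ALL_NAMES[offset:offset + limit]


collected = list(iter_all(fake_list_files, limit=100))
```

With a real dataset object, a `functools.partial` around the listing method could fix the folder argument before handing it to `iter_all`.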
-
list_permissions()¶ Lists the permission on the Dataset.
Returns: List of Permission Objects. Return type: list (crux.models.Permission)
-
list_resources(folder='/', offset=0, limit=1, include_folders=False, sort=None)¶ Lists the resources in Dataset.
Parameters: - folder (str) – Folder for which resource should be listed. Defaults to /.
- offset (int) – Sets the offset. Defaults to 0.
- limit (int) – Sets the limit. Defaults to 1.
- include_folders (bool) – Sets whether to include folders or not. Defaults to False.
- sort (str) – Sets whether to sort or not. Defaults to None.
Returns: List of File resource objects.
Return type: list (crux.models.Resource)
-
load_table_from_file(source_file, dest_table, append=False)¶ Loads table from file resource.
Parameters: - source_file (str or file) – Source File Path in string or File Object.
- dest_table (str or crux.models.Table) – Destination Table Path in string or Table Object.
- append (bool) – Sets whether to append to existing table. Defaults to False.
Returns: LoadJob Object.
Return type:
Raises: TypeError – If source_file or dest_table is not a file or string object.
-
modified_at¶ Gets the Dataset modified_at.
Type: str
-
name¶ Gets the Dataset Name.
Type: str
-
owner_identity_id¶ Gets the Owner Identity ID.
Type: str
-
provenance¶ Computes or gets the provenance.
Type: str
-
stitch(source_resources, destination_resource, labels=None, tags=None, description=None)¶ Stitches multiple Avro resources into a single Avro resource.
Parameters: - source_resources (list of str or file) – List of resource paths which are to be stitched.
- destination_resource (str) – Resource path to load the stitched output.
- labels (dict) – Key/Value labels that should be applied to the stitched resource.
- tags (list of str) – List of tags to be applied to the destination resource. Taken into consideration only if the resource needs to be created.
- description (str) – Description to be applied to the destination resource. Taken into consideration only if the resource needs to be created.
Returns: File object of the destination resource and Job ID of the background running job.
Return type: tuple (crux.models.File, str)
-
tags¶ Gets the tags.
Type: str
-
to_dict()¶ Transforms Dataset object to Dataset Dictionary.
Returns: Dataset Dictionary. Return type: dict
-
update(name=None, description=None, tags=None)¶ Updates the metadata of dataset.
Parameters: - name (str) – Name of the dataset. Defaults to None.
- description (str) – Description of the dataset. Defaults to None.
- tags (list of str) – List of tags. Defaults to None.
Returns: True, if dataset is updated.
Return type: bool
Raises: - ValueError – It is raised if name, description or tags are unset.
- TypeError – It is raised if tags is not of type list.
-
upload_file(src, dest, media_type=None, description=None, tags=None)¶ Uploads the File.
Parameters: - src (str or file) – Local OS path whose content is to be uploaded to file resource.
- dest (str) – File resource path.
- media_type (str) – Content type of the file. Defaults to None.
- description (str) – Description of the file. Defaults to None.
- tags (list of str) – Tags to be attached to the file resource.
Returns: File Object.
Return type: crux.models.File
-
upload_files(local_path, folder, media_type=None, description=None, tags=None)¶ Uploads the resources recursively.
Parameters: - local_path (str) – Local OS Path from where the file resources should be uploaded.
- media_type (str) – Content Types of File resources to be uploaded. Defaults to None.
- folder (str) – Crux Dataset Folder where file resources should be recursively uploaded.
- description (str) – Description to be set on uploaded resources. Defaults to None.
- tags (list of str) – Tags to be set on uploaded resources. Defaults to None.
Returns: List of uploaded file objects.
Return type: list (crux.models.File)
Raises: - ValueError – If folder or local_path is None.
- OSError – If local_path is an invalid directory location.
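A recursive upload boils down to walking a local directory and pairing each file with a destination path under the dataset folder. The sketch below previews such a plan without touching the network. It assumes the remote layout mirrors the local tree, which this reference does not state, and `plan_uploads` is a hypothetical helper that reuses the documented error conditions.

```python
import os
import posixpath
from pathlib import Path


def plan_uploads(local_path, folder):
    """Return sorted (local_file, destination_path) pairs for a directory tree."""
    if local_path is None or folder is None:
        raise ValueError("folder and local_path should be set")
    root = Path(local_path)
    if not root.is_dir():
        raise OSError("%s is not a valid directory" % local_path)

    plan = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            local_file = Path(dirpath) / filename
            # Path relative to the upload root, always with forward slashes.
            relative = local_file.relative_to(root).as_posix()
            plan.append((str(local_file), posixpath.join(folder, relative)))
    return sorted(plan)
```

Each destination in the plan could then be passed as dest to a per-file upload call, with the local file as src.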
-
upload_query(sql_file, path, description=None, tags=None)¶ Uploads the Query File.
Parameters: - path (str) – Query resource path.
- sql_file (str) – Local OS SQL file to be uploaded as query resource.
- description (str) – Description for the Query resource. Defaults to None.
- tags (list of str) – Tags for the Query resource. Defaults to None.
Returns: Query Object.
Return type: crux.models.Query
-
website¶ Gets the Dataset Website.
Type: str
-
-
class
crux.models.Query(id=None, dataset_id=None, folder_id=None, folder=None, name=None, size=None, type=None, config=None, provenance=None, as_of=None, created_at=None, modified_at=None, storage_id=None, description=None, media_type=None, tags=None, labels=None, connection=None, raw_response=None)¶ Bases:
crux.models.resource.Resource
Query Model.
-
download(dest, format='csv', params=None)¶ Downloads the Query output.
Parameters: - dest (str) – Local OS path at which resource will be downloaded.
- format (str) – Output format of the query. Defaults to csv.
- params (dict) – Run parameters. Defaults to None.
Returns: True if it is downloaded.
Return type: bool
-
run(format='csv', params=None, chunk_size=10485760, decode_unicode=False)¶ Streams the Query output.
Parameters: - format (str) – Output format of the query. Defaults to csv.
- params (dict) – Run parameters. Defaults to None.
- chunk_size (int) – Chunk Size for the stream.
- decode_unicode (bool) – If decode_unicode is True, content will be decoded using the best available encoding based on the response. Defaults to False.
Yields: bytes – Bytes of content.
Raises: ValueError – If chunk_size is not a multiple of 256 KiB.
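Since the stream arrives as byte chunks, a CSV row (or even a multi-byte character) can be split across two chunks. The following sketch shows one way to reassemble rows with the standard library. It assumes UTF-8 encoded CSV output, and both function names are hypothetical helpers, not part of crux.models.

```python
import codecs
import csv


def iter_text_lines(chunks, encoding="utf-8"):
    """Turn an iterable of byte chunks into newline-terminated text lines."""
    decoder = codecs.getincrementaldecoder(encoding)()
    pending = ""
    for chunk in chunks:
        # Everything before the last "\n" is complete; the rest waits for more data.
        *complete, pending = (pending + decoder.decode(chunk)).split("\n")
        for line in complete:
            yield line + "\n"
    pending += decoder.decode(b"", final=True)
    if pending:
        yield pending  # final line without a trailing newline


def rows_from_chunks(chunks, encoding="utf-8"):
    """Parse streamed CSV bytes into rows (lists of str)."""
    return csv.reader(iter_text_lines(chunks, encoding))


# Sample chunks that split both a row and a two-byte character.
sample_chunks = [b"name,qty\ncaf\xc3", b"\xa9,1\ntea,", b"2"]
rows = list(rows_from_chunks(sample_chunks))
```

The incremental decoder holds back incomplete byte sequences, so chunk boundaries never corrupt a character, and rows are produced as soon as their line is complete.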
-
to_dict()¶ Transforms Query object to Query dictionary.
Returns: Query dictionary. Return type: dict
-
-
class
crux.models.Label(label_key=None, label_value=None)¶ Bases:
crux.models.model.CruxModel
Label Model.
-
classmethod
from_dict(a_dict)¶ Transforms Label Dictionary to Label object.
Parameters: a_dict (dict) – Label Dictionary. Returns: Label Object. Return type: crux.models.Label
-
to_dict()¶ Transforms Label object to Label Dictionary.
Returns: Label Dictionary. Return type: dict
-
classmethod