crux.models package¶
Submodules¶
crux.models.dataset module¶
Module contains Dataset model.
-
class
crux.models.dataset.
Dataset
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
Dataset Model.
-
add_label
(label_key: str, label_value: str) → bool¶ Adds label to Dataset.
- Parameters
label_key (str) – Label Key for Dataset.
label_value (str) – Label Value for Dataset.
- Returns
True if labels are added.
- Return type
bool
-
add_permission
(identity_id: str, permission: str) → Union[bool, crux.models.permission.Permission]¶ Adds permission to the Dataset.
- Parameters
identity_id (str) – Identity Id to be set.
permission (str) – Permission to be set.
- Returns
Permission Object.
- Return type
crux.models.permission.Permission
-
add_permission_to_resources
(identity_id: str, permission: str, resource_paths: Optional[List[str]] = None, resource_objects: Optional[List[Union[crux.models.file.File, crux.models.folder.Folder]]] = None, resource_ids: Optional[List[str]] = None) → bool¶ Adds permission to all or specific Dataset resources.
- Parameters
identity_id (str) – Identity Id to be set.
permission (str) – Permission to be set.
resource_paths (list of str) – List of resource paths on which the permission should be applied. If none of resource_paths, resource_objects or resource_ids is set, the permission is applied to the whole dataset.
resource_objects (list of crux.models.Resource) – List of resource objects on which the permission should be applied. Overrides resource_paths. If none of resource_paths, resource_objects or resource_ids is set, the permission is applied to the whole dataset.
resource_ids (list of str) – List of resource ids on which the permission should be applied. Overrides resource_paths and resource_objects. If none of resource_paths, resource_objects or resource_ids is set, the permission is applied to the whole dataset.
- Returns
True if permission is applied.
- Return type
bool
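The override order described for the three resource arguments can be sketched as a small selection function. This is only an illustration of the documented precedence, not the client's code; the name select_targets is hypothetical, and treating an empty list the same as an unset argument is an assumption.

```python
from typing import List, Optional, Tuple


def select_targets(
    resource_paths: Optional[List[str]] = None,
    resource_objects: Optional[list] = None,
    resource_ids: Optional[List[str]] = None,
) -> Tuple[str, Optional[list]]:
    """Return the argument that takes effect under the documented precedence."""
    if resource_ids:  # overrides resource_paths and resource_objects
        return ("resource_ids", resource_ids)
    if resource_objects:  # overrides resource_paths
        return ("resource_objects", resource_objects)
    if resource_paths:
        return ("resource_paths", resource_paths)
    # Nothing set: the permission applies to the whole dataset.
    return ("whole_dataset", None)
```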
-
property
contact_identity_id
¶ Gets the Contact Identity ID.
- Type
str
-
create
() → bool¶ Creates the Dataset.
- Returns
True if dataset is created.
- Return type
bool
-
create_file
(path: str, tags: Optional[List[str]] = None, description: Optional[str] = None) → crux.models.file.File¶ Creates File resource in Dataset.
- Parameters
path (str) – Path of the file resource.
tags (list of str) – Tags of the file resource. Defaults to None.
description (str) – Description of the file resource.
- Returns
File Object.
- Return type
crux.models.file.File
-
create_folder
(path: str, folder: str = '/', tags: Optional[List[str]] = None, description: Optional[str] = None) → crux.models.folder.Folder¶ Creates Folder resource in Dataset.
- Parameters
path (str) – Path of the Folder resource.
folder (str) – Parent folder of the Folder resource. Defaults to /.
tags (list of str) – Tags of the Folder resource. Defaults to None.
description (str) – Description of the Folder resource. Defaults to None.
- Returns
Folder Object.
- Return type
crux.models.folder.Folder
-
property
created_at
¶ Gets the Dataset created_at.
- Type
str
-
delete
() → bool¶ Deletes the Dataset.
- Returns
True if dataset is deleted.
- Return type
bool
-
delete_label
(label_key: str) → bool¶ Deletes label from Dataset.
- Parameters
label_key (str) – Label Key for Dataset.
- Returns
True if labels are deleted.
- Return type
bool
-
delete_permission
(identity_id: str, permission: str) → bool¶ Deletes permission from the Dataset.
- Parameters
identity_id (str) – Identity Id for the deletion.
permission (str) – Permission for the deletion.
- Returns
True if it is able to delete it.
- Return type
bool
-
delete_permission_from_resources
(identity_id: str, permission: str, resource_paths: Optional[List[str]] = None, resource_objects: Optional[List[Union[crux.models.file.File, crux.models.folder.Folder]]] = None, resource_ids: Optional[List[str]] = None) → bool¶ Method which deletes permission from all or specific Dataset resources.
- Parameters
identity_id (str) – Identity Id for the deletion.
permission (str) – Permission for the deletion.
resource_paths (list of str) – List of resource paths from which the permission should be deleted. If none of resource_paths, resource_objects or resource_ids is set, the permission is deleted from the whole dataset.
resource_objects (list of crux.models.Resource) – List of resource objects from which the permission should be deleted. Overrides resource_paths. If none of resource_paths, resource_objects or resource_ids is set, the permission is deleted from the whole dataset.
resource_ids (list of str) – List of resource ids from which the permission should be deleted. Overrides resource_paths and resource_objects. If none of resource_paths, resource_objects or resource_ids is set, the permission is deleted from the whole dataset.
- Returns
True if it is able to delete the permission.
- Return type
bool
-
property
description
¶ Gets the Dataset Description.
- Type
str
-
download_files
(folder: str, local_path: str, only_use_crux_domains: Optional[bool] = None) → List[str]¶ Downloads the resources recursively.
- Parameters
folder (str) – Crux Dataset Folder from where the file resources should be recursively downloaded.
local_path (str) – Local OS Path where the file resources should be downloaded.
only_use_crux_domains (bool) – True if content is required to be downloaded from Crux domains else False.
- Returns
List of location of download files.
- Return type
list (str)
- Raises
ValueError – If folder or local_path is None.
OSError – If local_path is an invalid directory location.
-
find_resources_by_label
(predicates: List[Dict[str, str]], max_per_page: int = 1000) → Iterator[Union[crux.models.file.File, crux.models.folder.Folder]]¶ Searches the resources in the Dataset for the given labels.
Each predicate can be either:
Lexicographical equal
Lexicographical not equal
Lexicographical less than
Lexicographical less than or equal to
Lexicographical greater than
Lexicographical greater than or equal to
A list of OR predicates
A list of AND predicates
    predicates = [
        {"op": "eq", "key": "key1", "val": "abcd"},
        {"op": "ne", "key": "key1", "val": "zzzz"},
        {"op": "lt", "key": "key1", "val": "abd"},
        {"op": "gt", "key": "key1", "val": "abc"},
        {"op": "lte", "key": "key1", "val": "abd"},
        {"op": "gte", "key": "key1", "val": "abc"},
        {"op": "or", "in": [
            {"op": "eq", "key": "key1", "val": "abcd"},
            # more OR predicates...
        ]},
        {"op": "and", "in": [
            {"op": "eq", "key": "key1", "val": "abcd"},
            # more AND predicates...
        ]}
    ]
- Parameters
predicates (list of dict) – List of dictionary predicates for finding resources.
max_per_page (int) – Pagination limit. Defaults to 1000.
- Returns
List of resources matching the query parameters.
- Return type
list (crux.models.Resource)
Example
    from crux import Crux

    conn = Crux()
    dataset_object = conn.get_dataset(id="dataset_id")
    predicates = [
        {"op": "eq", "key": "test_label1", "val": "test_value1"}
    ]
    resource_objects = dataset_object.find_resources_by_label(
        predicates=predicates
    )
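To make the predicate format concrete, the following sketch evaluates predicates locally against a plain dict of labels. It is not how the Crux service evaluates them; the function names are hypothetical, and two behaviours are assumptions: a missing label key never matches, and the top-level list is combined with AND.

```python
import operator
from typing import Dict, List

# Comparison operators applied to strings, so ordering is lexicographical.
_COMPARATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
}


def matches(labels: Dict[str, str], predicate: dict) -> bool:
    """Evaluate one predicate, recursing into "or" / "and" lists."""
    op = predicate["op"]
    if op == "or":
        return any(matches(labels, inner) for inner in predicate["in"])
    if op == "and":
        return all(matches(labels, inner) for inner in predicate["in"])
    key = predicate["key"]
    if key not in labels:
        return False  # assumption: a missing label never matches
    return _COMPARATORS[op](labels[key], predicate["val"])


def matches_all(labels: Dict[str, str], predicates: List[dict]) -> bool:
    """Assumption: the top-level predicate list is combined with AND."""
    return all(matches(labels, predicate) for predicate in predicates)
```

Since the values are strings, "abcd" sorts after "abc" and before "abd", which is why the sample values in the predicate listing are mutually consistent.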
-
get_delivery
(delivery_id: str) → crux.models.delivery.Delivery¶ Gets Delivery object.
- Parameters
delivery_id (str) – Delivery ID.
- Returns
Delivery Object.
- Return type
crux.models.delivery.Delivery
- Raises
ValueError – If delivery_id value is invalid.
-
get_file
(path: str) → crux.models.file.File¶ Gets the File resource object.
- Parameters
path (str) – File resource path.
- Returns
File Object.
- Return type
crux.models.file.File
-
get_files_range
(start_date: Union[datetime.datetime, str], end_date: Union[datetime.datetime, str, None] = None, frames: Union[str, list, None] = None, file_format: str = 'avro/binary', dayfirst: bool = False, yearfirst: bool = False, latest_only: bool = False, delivery_status: Optional[str] = None, use_cache: Optional[bool] = None) → Iterator[crux.models.file.File]¶ Get a set of dataset file resources. The best single delivery version for each supplier_implied_dt is selected for the given time range.
- Parameters
start_date (datetime or str) – ISO format start datetime or any parseable date string.
end_date (datetime or str) – ISO format end datetime or any parseable date string.
frames (str, list) – Filter for selected frames.
file_format (str) – File format of delivery.
dayfirst (bool) – Parse format for str date.
yearfirst (bool) – Parse format for str date.
latest_only (bool) – Return latest files only.
delivery_status (str) – Delivery status enum.
use_cache (bool) – Preference to set cached response.
- Returns
List of file resources.
- Return type
list (crux.models.File)
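The dayfirst flag matters only for ambiguous date strings. The source does not say which parser the client uses (the flag names match those of dateutil.parser.parse, but that is an assumption), so the sketch below uses only the standard library to show what such a flag conventionally means. The helper name parse_slash_date is hypothetical.

```python
from datetime import datetime


def parse_slash_date(text: str, dayfirst: bool = False) -> datetime:
    """Parse 'NN/NN/YYYY', reading the first field as the day when dayfirst is True."""
    fmt = "%d/%m/%Y" if dayfirst else "%m/%d/%Y"
    return datetime.strptime(text, fmt)


month_first = parse_slash_date("01/02/2020")               # 2 January 2020
day_first = parse_slash_date("01/02/2020", dayfirst=True)  # 1 February 2020
```

Passing ISO format strings or datetime objects for start_date and end_date avoids the ambiguity altogether.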
-
get_folder
(path: str) → crux.models.folder.Folder¶ Gets the Folder resource object.
- Parameters
path (str) – Folder resource path.
- Returns
Folder Object.
- Return type
crux.models.folder.Folder
-
get_ingestions
(start_date: Optional[str] = None, end_date: Optional[str] = None, delivery_status: Optional[str] = None, use_cache: Optional[bool] = None) → Iterator[crux.models.ingestion.Ingestion]¶ Gets Ingestions.
- Parameters
start_date (str) – ISO format start time.
end_date (str) – ISO format end time.
delivery_status (str) – Delivery status enum
use_cache (bool) – Preference to set cached response
- Returns
Ingestion objects.
- Return type
Iterator[crux.models.ingestion.Ingestion]
-
get_label
(label_key: str) → crux.models.label.Label¶ Gets label value of Dataset.
- Parameters
label_key (str) – Label Key for Dataset.
- Returns
Label Object.
- Return type
crux.models.label.Label
-
get_latest_files
(frames: Union[str, List, None] = None, file_format: str = 'avro/binary', cutoff_date: Optional[str] = None, dayfirst: bool = False, yearfirst: bool = False, delivery_status: Optional[str] = None, use_cache: Optional[bool] = None) → Iterator[crux.models.file.File]¶ Get the latest dataset file resources. The latest supplier_implied_dt with the best single delivery version for each frame is selected.
- Parameters
frames (str, list) – filter for selected frames
file_format (str) – File format of delivery
cutoff_date (datetime, str) – Search up to this date
dayfirst (bool) – Parse format for str date
yearfirst (bool) – Parse format for str date
delivery_status (str) – Delivery status enum
use_cache (bool) – Preference to set cached response
- Returns
List of file resources.
- Return type
list (crux.models.File)
-
get_resources_batch
(resource_ids: list) → Iterator[crux.models.file.File]¶ Gets resource metadata.
- Parameters
resource_ids (list) – List of resource IDs
- Returns
List of file resources.
- Return type
list (crux.models.File)
-
get_stitch_job
(job_id: str) → crux.models.job.StitchJob¶ Stitch Job Details.
- Parameters
job_id (str) – Job ID of the Stitch Job.
- Returns
StitchJob object.
- Return type
crux.models.job.StitchJob
-
property
id
¶ Gets the Dataset ID.
- Type
str
-
list_files
(sort: Optional[str] = None, folder: str = '/', cursor: Optional[str] = None, limit: int = 100) → List[crux.models.file.File]¶ Lists the files.
- Parameters
sort (str) – Sets whether to sort or not. Defaults to None.
folder (str) – Folder for which resource should be listed. Defaults to /.
cursor (str) – Sets the offset to the page cursor. Defaults to None.
limit (int) – Sets the limit. Defaults to 100.
- Returns
List of File objects.
- Return type
list (crux.models.File)
-
list_permissions
() → List[crux.models.permission.Permission]¶ Lists the permission on the Dataset.
- Returns
List of Permission Objects.
- Return type
list (crux.models.Permission)
-
list_resources
(folder: str = '/', cursor: str = None, limit: int = 1, include_folders: bool = False, sort: str = None) → Generator[Resource]¶ Lists the resources in Dataset.
- Parameters
folder (str) – Folder for which resource should be listed. Defaults to /.
cursor (str) – Sets the offset to the page cursor. Defaults to None.
limit (int) – Sets the limit. Defaults to 1.
include_folders (bool) – Sets whether to include folders or not. Defaults to False.
sort (str) – Sets whether to sort or not. Defaults to None.
- Returns
List of File resource objects.
- Return type
list (crux.models.Resource)
-
property
modified_at
¶ Gets the Dataset modified_at.
- Type
str
-
property
name
¶ Gets the Dataset Name.
- Type
str
-
property
owner_identity_id
¶ Gets the Owner Identity ID.
- Type
str
-
property
provenance
¶ Compute or Get the provenance.
- Type
str
-
refresh
()¶ Refresh Resource model from API backend.
- Returns
- True, if it is able to refresh the model,
False otherwise.
- Return type
bool
-
stitch
(source_resources: List[Union[str, crux.models.file.File]], destination_resource: str, labels: Optional[str] = None, tags: Optional[List[str]] = None, description: Optional[str] = None) → Tuple[crux.models.file.File, str]¶ Method which stitches multiple Avro resources into single Avro resource
- Parameters
source_resources (list of str or file) – List of resource paths which are to be stitched.
destination_resource (str) – Resource path to load the stitched output.
labels (dict) – Key/value labels that should be applied to the stitched resource.
tags (list of str) – List of tags to be applied to the destination resource. Taken into consideration only if the resource needs to be created.
description (str) – Description to be applied to the destination resource. Taken into consideration only if the resource needs to be created.
- Returns
- File object of destination resource.
Job ID for background running job.
- Return type
tuple (crux.models.File, str)
- Raises
TypeError – Source and Destination resource should be of type File or String
-
property
tags
¶ Gets the tags.
- Raises
TypeError – If tags is not a list
- Type
str
-
update
(name: Optional[str] = None, description: Optional[str] = None, tags: Optional[List[str]] = None) → bool¶ Updates the Dataset.
- Parameters
name (str) – Name of the dataset. Defaults to None.
description (str) – Description of the dataset. Defaults to None.
tags (list of str) – List of tags. Defaults to None.
- Returns
True, if dataset is updated.
- Return type
bool
-
upload_file
(src: Union[IO, str], dest: str, media_type: Optional[str] = None, description: Optional[str] = None, tags: Optional[List[str]] = None, only_use_crux_domains: Optional[bool] = None) → crux.models.file.File¶ Uploads the File.
- Parameters
src (str or file) – Local OS path whose content is to be uploaded to file resource.
dest (str) – File resource path.
media_type (str) – Content type of the file. Defaults to None.
description (str) – Description of the file. Defaults to None.
tags (list of str) – Tags to be attached to the file resource.
only_use_crux_domains (bool) – True if content is required to be downloaded from Crux domains else False.
- Returns
File Object.
- Return type
crux.models.file.File
-
upload_files
(local_path: str, folder: str, media_type: Optional[str] = None, description: Optional[str] = None, tags: Optional[List[str]] = None, only_use_crux_domains: Optional[bool] = None) → List[crux.models.file.File]¶ Uploads the resources recursively.
- Parameters
local_path (str) – Local OS Path from where the file resources should be uploaded.
media_type (str) – Content Types of File resources to be uploaded. Defaults to None.
folder (str) – Crux Dataset Folder where file resources should be recursively uploaded.
description (str) – Description to be set on uploaded resources. Defaults to None.
tags (list of str) – Tags to be set on uploaded resources. Defaults to None.
only_use_crux_domains (bool) – True if content is required to be downloaded from Crux domains else False.
- Returns
List of uploaded file objects.
- Return type
list (crux.models.File)
- Raises
ValueError – If folder or local_path is None.
OSError – If local_path is an invalid directory location.
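A recursive upload pairs every file under local_path with a destination path under folder. The sketch below plans that mapping with the standard library and mirrors the documented ValueError and OSError conditions. It does not call the Crux API; the names plan_uploads and demo_destinations are hypothetical, and keeping the local sub-directory layout under folder is an assumption about how the recursion maps paths.

```python
import os
import posixpath
import tempfile
from typing import List, Optional, Tuple


def plan_uploads(local_path: Optional[str], folder: Optional[str]) -> List[Tuple[str, str]]:
    """Pair each local file with the dataset path it would be uploaded to."""
    if local_path is None or folder is None:
        raise ValueError("folder and local_path must not be None")
    if not os.path.isdir(local_path):
        raise OSError("local_path is not a directory: {!r}".format(local_path))
    plan = []
    for root, _dirs, files in os.walk(local_path):
        relative = os.path.relpath(root, local_path)
        parts = [] if relative == os.curdir else relative.split(os.sep)
        for name in files:
            source = os.path.join(root, name)
            # Dataset paths use forward slashes regardless of the local OS.
            destination = posixpath.join(folder, *parts, name)
            plan.append((source, destination))
    # Sort by destination so the plan is stable across platforms.
    return sorted(plan, key=lambda pair: pair[1])


def demo_destinations() -> List[str]:
    """Build a throwaway tree and return the destinations planned for it."""
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "daily"))
        for relative in ("a.csv", os.path.join("daily", "b.csv")):
            with open(os.path.join(tmp, relative), "w") as handle:
                handle.write("x")
        return [dest for _src, dest in plan_uploads(tmp, "/incoming")]
```

Each (source, destination) pair in the plan corresponds to one upload_file(src, dest) call on the Dataset.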
-
property
website
¶ Gets the Dataset Website.
- Type
str
-
crux.models.delivery module¶
Module contains Delivery model.
-
class
crux.models.delivery.
Delivery
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
Delivery Model.
-
property
dataset_id
¶ Gets the Dataset ID.
- Type
str
-
get_data
(file_format: str = 'avro/binary', use_cache: Optional[bool] = None) → Iterator[crux.models.resource.Resource]¶ Get the processed delivery data
- Parameters
file_format (str) – File format of delivery.
use_cache (bool) – Preference to set cached response
- Returns
List of resources.
- Return type
list (crux.models.Resource)
-
get_healthlog
(use_cache: Optional[bool] = None) → Iterator[crux.models.resource.Resource]¶ Get delivery healthlog information
- Parameters
use_cache (bool) – Preference to set cached response
- Returns
Healthlog Json Object.
- Return type
dict
-
get_raw
(use_cache: Optional[bool] = None) → Iterator[crux.models.resource.Resource]¶ Get the raw delivery data
- Parameters
use_cache (bool) – Preference to set cached response
- Returns
List of resources.
- Return type
list (crux.models.Resource)
-
property
id
¶ Gets the Delivery ID.
- Type
str
-
property
ingestion_time
¶ Gets ingestion time of delivery.
- Type
str
-
property
schedule_datetime
¶ Gets schedule datetime of delivery.
- Type
str
-
property
status
¶ Gets the Status of delivery.
- Type
str
-
property
summary
¶ Gets the Delivery Summary
- Type
dict
-
crux.models.file module¶
Module contains File model.
-
class
crux.models.file.
File
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.resource.Resource
File Model.
-
download
(dest: str, chunk_size: int = 10485760, only_use_crux_domains: Optional[bool] = None) → bool¶ Downloads the file resource.
- Parameters
dest (str or file) – Local OS path at which file resource will be downloaded.
chunk_size (int) – Number of bytes to be read in memory.
only_use_crux_domains (bool) – True if content is required to be downloaded from Crux domains else False.
- Returns
True if it is downloaded.
- Return type
bool
- Raises
TypeError – If dest is not a file like or string type.
-
iter_content
(chunk_size: int = 10485760, only_use_crux_domains: Optional[bool] = None) → Iterable[str]¶ Streams the file resource.
- Parameters
chunk_size (int) – Chunk Size for the stream.
only_use_crux_domains (bool) – True if content is required to be downloaded from Crux domains else False.
- Yields
bytes – Bytes of file resource.
- Raises
ValueError – If chunk_size is not multiple of 256 KiB.
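Since iter_content rejects a chunk_size that is not a multiple of 256 KiB, it can help to check or round the value before calling it. A minimal sketch follows; the helper names are hypothetical, and rejecting non-positive sizes is an assumption not stated above.

```python
CHUNK_UNIT = 256 * 1024        # 256 KiB, the documented granularity
DEFAULT_CHUNK_SIZE = 10485760  # default from the signature (40 * 256 KiB)


def is_valid_chunk_size(chunk_size: int) -> bool:
    """True if chunk_size is a positive multiple of 256 KiB."""
    return chunk_size > 0 and chunk_size % CHUNK_UNIT == 0


def round_up_chunk_size(requested: int) -> int:
    """Round a requested size up to the next multiple of 256 KiB (at least one unit)."""
    units = max(1, -(-requested // CHUNK_UNIT))  # ceiling division
    return units * CHUNK_UNIT
```

For example, a requested size of 1,000,000 bytes rounds up to 1,048,576, which could then be passed as chunk_size.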
-
upload
(src: Union[IO, str], media_type: Optional[str] = None, only_use_crux_domains: Optional[bool] = None) → crux.models.file.File¶ Uploads the content to empty file resource.
- Parameters
src (str or file) – Local OS path whose content is to be uploaded.
media_type (str) – Content type of the file. Defaults to None.
only_use_crux_domains (bool) – True if content is required to be downloaded from Crux domains else False.
- Returns
File: File model object.
- Raises
TypeError – If src type is invalid.
-
crux.models.folder module¶
Module contains Folder model.
-
class
crux.models.folder.
Folder
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.resource.Resource
Folder Model.
-
add_permission
(identity_id: str, permission: str, recursive: bool = False) → Union[bool, crux.models.permission.Permission]¶ Adds permission to the Folder resource.
- Parameters
identity_id (str) – Identity Id to be set.
permission (str) – Permission to be set.
recursive (bool) – If recursive is set to True, it will recursively apply the permission to all resources under the folder resource.
- Returns
- If recursive is set then it returns True.
If recursive is unset then it returns Permission object.
- Return type
bool or crux.models.Permission
-
delete_permission
(identity_id: str, permission: str, recursive: bool = False) → bool¶ Deletes permission from Folder resource.
- Parameters
identity_id (str) – Identity Id for the deletion.
permission (str) – Permission for deletion.
recursive (bool) – If recursive is set to True, it will recursively delete permission from all resources under the folder resource. Defaults to False.
- Returns
True if it is able to delete it.
- Return type
bool
-
crux.models.identity module¶
Module contains Identity model.
-
class
crux.models.identity.
Identity
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
Identity Model.
-
property
company_name
¶ Gets the Company name.
- Type
str
-
property
description
¶ Gets the Description.
- Type
str
-
property
email
¶ Gets the Email.
- Type
str
-
property
first_name
¶ Gets the First name.
- Type
str
-
property
identity_id
¶ Gets the Identity Id.
- Type
str
-
property
landing_page
¶ Gets the Landing Page.
- Type
str
-
property
last_name
¶ Gets the Last name.
- Type
str
-
property
parent_identity_id
¶ Gets the Parent Identity Id.
- Type
str
-
property
phone
¶ Gets the phone.
- Type
str
-
property
role
¶ Gets the Role.
- Type
str
-
property
type
¶ Gets the Type.
- Type
str
-
property
website
¶ Gets the Website.
- Type
str
-
crux.models.ingestion module¶
Module contains Ingestion model.
-
class
crux.models.ingestion.
Ingestion
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
Ingestion Model.
-
property
dataset_id
¶ Gets the Dataset ID.
- Type
str
-
get_data
(version: Optional[int] = None, file_format: str = 'avro/binary') → Iterator[crux.models.resource.Resource]¶ Get the processed delivery data
- Parameters
version (int) – Version of the delivery.
file_format (str) – File format of delivery.
status (accepted) – List of acceptable statuses. Defaults to None.
- Returns
List of resources.
- Return type
list (crux.models.Resource)
-
get_raw
(version=None) → Iterator[crux.models.resource.Resource]¶ Get the raw delivery data
- Parameters
version (int) – Version of the delivery.
- Returns
List of resources.
- Return type
list (crux.models.Resource)
-
property
id
¶ Gets the Ingestion ID.
- Type
str
-
property
versions
¶ Gets the list of versions.
- Type
list
-
crux.models.job module¶
Module contains AbstractJob, Job, LoadJob Model.
-
class
crux.models.job.
AbstractJob
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
AbstractJob Model.
-
class
crux.models.job.
Job
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.job.AbstractJob
Job Model.
-
property
job_id
¶ Gets the Job Id.
- Type
str
-
property
statistics
¶ Gets the Job Statistics.
- Type
str
-
property
status
¶ Gets the Job Status.
- Type
str
-
class
crux.models.job.
StitchJob
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.job.AbstractJob
Stitch Job Model.
-
property
job_id
¶ Gets the Job Id.
- Type
str
-
property
status
¶ Gets the Job Status.
- Type
str
-
crux.models.label module¶
Module contains Label model.
-
class
crux.models.label.
Label
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
Label Model.
-
property
label_key
¶ Gets the Label Key.
- Type
str
-
property
label_value
¶ Gets the Label Value.
- Type
str
-
crux.models.model module¶
Module defines abstract CruxModel.
-
class
crux.models.model.
CruxModel
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
object
Base Crux model.
-
property
connection
¶ API connection client.
- Type
CruxClient
-
classmethod
from_dict
(a_dict: Dict[str, Any], connection: Optional[crux._client.CruxClient] = None) → Any¶ Returns model instance created from raw model dict.
- Parameters
a_dict (dict) – Model dict.
connection (CruxClient) – Connection object. Defaults to None.
- Returns
Model instance.
- Return type
-
to_dict
() → Dict[str, Any]¶ Returns dict copy of raw model.
- Returns
Raw model dict.
- Return type
dict
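The from_dict / to_dict pair can be pictured with a small stand-in class. This is a sketch of the pattern only, not the CruxModel implementation: the class names, the raw dict keys and the use of a deep copy in to_dict are all assumptions made for illustration.

```python
import copy
from typing import Any, Dict, Optional


class SketchModel:
    """Hypothetical base class that wraps a raw model dict."""

    def __init__(self, raw_model: Optional[Dict] = None, connection: Any = None):
        self.raw_model = raw_model if raw_model is not None else {}
        self.connection = connection

    @classmethod
    def from_dict(cls, a_dict: Dict[str, Any], connection: Any = None) -> "SketchModel":
        # Using cls means subclasses get an instance of their own type back.
        return cls(raw_model=a_dict, connection=connection)

    def to_dict(self) -> Dict[str, Any]:
        # Return a copy so callers cannot mutate the model through the result.
        return copy.deepcopy(self.raw_model)


class SketchLabel(SketchModel):
    """Hypothetical subclass; the key name 'label_key' is an assumption."""

    @property
    def label_key(self) -> str:
        return self.raw_model["label_key"]


label = SketchLabel.from_dict({"label_key": "region", "label_value": "emea"})
snapshot = label.to_dict()
snapshot["label_key"] = "changed"  # does not affect the model itself
```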
-
to_str
() → str¶ Abstract to_str method.
-
crux.models.permission module¶
Module contains Permission model.
-
class
crux.models.permission.
Permission
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
Permission Model.
-
property
identity_id
¶ Gets the Identity ID.
- Type
str
-
property
permission_name
¶ Gets the Permission Name.
- Type
str
-
property
target_id
¶ Gets the Target ID.
- Type
str
-
crux.models.resource module¶
Module contains Resource model.
-
class
crux.models.resource.
MediaType
¶ Bases:
enum.Enum
MediaType Enumeration Model.
-
AVRO
= 'avro/binary'¶
-
CSV
= 'text/csv'¶
-
JSON
= 'application/json'¶
-
NDJSON
= 'application/x-ndjson'¶
-
PARQUET
= 'parquet/binary'¶
-
classmethod
detect
(file_name: str) → str¶ Detects the media_type from the file extension.
- Parameters
file_name (str) – Absolute or Relative Path of the file.
- Returns
MediaType extension.
- Return type
str
- Raises
LookupError – If file type is not supported.
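The enumeration values above and the behaviour of detect can be sketched with the standard enum module. The member values are taken from the listing above; the class name MediaTypeSketch is hypothetical, and matching the file extension to the lower-cased member name is an assumption, since the source does not list which extensions are recognised.

```python
import enum
import os


class MediaTypeSketch(enum.Enum):
    """Stand-in mirroring the documented MediaType values."""

    AVRO = "avro/binary"
    CSV = "text/csv"
    JSON = "application/json"
    NDJSON = "application/x-ndjson"
    PARQUET = "parquet/binary"

    @classmethod
    def detect(cls, file_name: str) -> str:
        """Return the media type string for file_name, based on its extension."""
        # Assumption: the extension equals the member name, case-insensitively.
        extension = os.path.splitext(file_name)[1].lstrip(".").upper()
        if extension in cls.__members__:
            return cls[extension].value
        raise LookupError("unsupported file type: {!r}".format(file_name))


csv_type = MediaTypeSketch.detect("data/prices.csv")
```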
-
-
class
crux.models.resource.
Resource
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
Resource Model.
-
add_label
(label_key: str, label_value: str) → bool¶ Adds label to Resource.
- Parameters
label_key (str) – Label Key for Resource.
label_value (str) – Label Value for Resource.
- Returns
True if label is added, False otherwise.
- Return type
bool
-
add_labels
(labels_dict: dict) → bool¶ Adds multiple labels to Resource.
- Parameters
labels_dict (dict) – Labels (key/value pairs) to add to the Resource.
- Returns
True if the labels were added, False otherwise.
- Return type
bool
-
add_permission
(identity_id: str, permission: str) → Union[bool, crux.models.permission.Permission]¶ Adds permission to the resource.
- Parameters
identity_id (str) – Identity Id to be set.
permission (str) – Permission to be set.
- Returns
Permission Object.
- Return type
crux.models.permission.Permission
-
property
as_of
¶ Gets the as_of.
- Type
str
-
property
created_at
¶ Gets created_at.
- Type
str
-
property
dataset_id
¶ Gets the Dataset ID.
- Type
str
-
delete
() → bool¶ Deletes Resource from Dataset.
- Returns
True if it is deleted.
- Return type
bool
-
delete_label
(label_key: str) → bool¶ Deletes label from Resource.
- Parameters
label_key (str) – Label Key for Resource.
- Returns
True if label is deleted, False otherwise.
- Return type
bool
-
delete_permission
(identity_id: str, permission: str) → bool¶ Deletes permission from the resource.
- Parameters
identity_id (str) – Identity Id for the deletion.
permission (str) – Permission for the deletion.
- Returns
True if it is able to delete it.
- Return type
bool
-
property
description
¶ Gets the Resource Description.
- Type
str
-
property
folder
¶ Compute or Get the folder name.
- Type
str
-
property
folder_id
¶ Gets the Folder ID.
- Type
str
-
property
frame_id
¶ Gets the Frame ID.
- Type
str
-
property
id
¶ Gets the Resource ID.
- Type
str
-
property
ingestion_time
¶ Gets the ingestion time.
- Type
str
-
property
labels
¶ Gets the Resource labels.
- Type
dict
-
list_permissions
() → List[crux.models.permission.Permission]¶ Lists the permission on the resource.
- Returns
List of Permission Objects.
- Return type
list (crux.models.Permission)
-
property
media_type
¶ Gets the Media type.
- Type
str
-
property
modified_at
¶ Gets modified_at.
- Type
str
-
property
name
¶ Gets the Resource Name.
- Type
str
-
property
path
¶ Compute or Get the resource path.
- Type
str
-
property
provenance
¶ Gets the Provenance.
- Type
dict
-
refresh
()¶ Refresh Resource model from API backend.
- Returns
- True, if it is able to refresh the model,
False otherwise.
- Return type
bool
-
property
size
¶ Gets the size.
- Type
int
-
property
storage_id
¶ Gets the Storage ID.
- Type
str
-
property
supplier_implied_dt
¶ Gets the supplier date.
- Type
str
-
property
tags
¶ Gets the Resource Tags.
- Type
list of str
-
property
type
¶ Gets the Resource Type.
- Type
str
-
update
(name: Optional[str] = None, description: Optional[str] = None, tags: Optional[List[str]] = None, provenance: Optional[str] = None) → bool¶ Updates the metadata for Resource.
- Parameters
name (str) – Name of resource. Defaults to None.
description (str) – Description of the resource. Defaults to None.
tags (list of str) – List of tags. Defaults to None.
provenance (str) – Provenance for a resource. Defaults to None.
- Returns
True, if resource is updated.
- Return type
bool
- Raises
ValueError – It is raised if name, description or tags are unset.
TypeError – It is raised if tags are not of type List.
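The two error conditions listed for update can be mirrored in a small pre-check. This sketch is not the library's code: the function name build_update_payload and the payload keys are hypothetical, and it reads the ValueError condition as "name, description and tags are all unset", which is an interpretation of the wording above.

```python
from typing import Any, Dict, List, Optional


def build_update_payload(
    name: Optional[str] = None,
    description: Optional[str] = None,
    tags: Optional[List[str]] = None,
    provenance: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate the arguments and keep only the fields that were set."""
    if tags is not None and not isinstance(tags, list):
        raise TypeError("tags must be a list")
    if name is None and description is None and tags is None:
        raise ValueError("name, description or tags must be set")
    candidates = {
        "name": name,
        "description": description,
        "tags": tags,
        "provenance": provenance,
    }
    return {key: value for key, value in candidates.items() if value is not None}
```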
-
Module contents¶
Module containing models that represent objects returned by the API.
-
class
crux.models.
Identity
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
Identity Model.
-
property
company_name
¶ Gets the Company name.
- Type
str
-
property
description
¶ Gets the Description.
- Type
str
-
property
email
¶ Gets the Email.
- Type
str
-
property
first_name
¶ Gets the First name.
- Type
str
-
property
identity_id
¶ Gets the Identity Id.
- Type
str
-
property
landing_page
¶ Gets the Landing Page.
- Type
str
-
property
last_name
¶ Gets the Last name.
- Type
str
-
property
parent_identity_id
¶ Gets the Parent Identity Id.
- Type
str
-
property
phone
¶ Gets the phone.
- Type
str
-
property
role
¶ Gets the Role.
- Type
str
-
property
type
¶ Gets the Type.
- Type
str
-
property
website
¶ Gets the Website.
- Type
str
-
class
crux.models.
Permission
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
Permission Model.
-
property
identity_id
¶ Gets the Identity ID.
- Type
str
-
property
permission_name
¶ Gets the Permission Name.
- Type
str
-
property
target_id
¶ Gets the Target ID.
- Type
str
-
class
crux.models.
StitchJob
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.job.AbstractJob
Stitch Job Model.
-
property
job_id
¶ Gets the Job Id.
- Type
str
-
property
status
¶ Gets the Job Status.
- Type
str
-
class
crux.models.
Job
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.job.AbstractJob
Job Model.
-
property
job_id
¶ Gets the Job Id.
- Type
str
-
property
statistics
¶ Gets the Job Statistics.
- Type
str
-
property
status
¶ Gets the Job Status.
- Type
str
-
class
crux.models.
Resource
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
Resource Model.
-
add_label
(label_key: str, label_value: str) → bool¶ Adds label to Resource.
- Parameters
label_key (str) – Label Key for Resource.
label_value (str) – Label Value for Resource.
- Returns
True if label is added, False otherwise.
- Return type
bool
-
add_labels
(labels_dict: dict) → bool¶ Adds multiple labels to Resource.
- Parameters
labels_dict (dict) – Labels (key/value pairs) to add to the Resource.
- Returns
True if the labels were added, False otherwise.
- Return type
bool
-
add_permission
(identity_id: str, permission: str) → Union[bool, crux.models.permission.Permission]¶ Adds permission to the resource.
- Parameters
identity_id (str) – Identity Id to be set.
permission (str) – Permission to be set.
- Returns
Permission Object.
- Return type
crux.models.permission.Permission
-
property
as_of
¶ Gets the as_of.
- Type
str
-
property
created_at
¶ Gets created_at.
- Type
str
-
property
dataset_id
¶ Gets the Dataset ID.
- Type
str
-
delete
() → bool¶ Deletes Resource from Dataset.
- Returns
True if it is deleted.
- Return type
bool
-
delete_label
(label_key: str) → bool¶ Deletes label from Resource.
- Parameters
label_key (str) – Label Key for Resource.
- Returns
True if label is deleted, False otherwise.
- Return type
bool
-
delete_permission
(identity_id: str, permission: str) → bool¶ Deletes permission from the resource.
- Parameters
identity_id (str) – Identity Id for the deletion.
permission (str) – Permission for the deletion.
- Returns
True if it is able to delete it.
- Return type
bool
-
property
description
¶ Gets the Resource Description.
- Type
str
-
property
folder
¶ Compute or Get the folder name.
- Type
str
-
property
folder_id
¶ Gets the Folder ID.
- Type
str
-
property
frame_id
¶ Gets the Frame ID.
- Type
str
-
property
id
¶ Gets the Resource ID.
- Type
str
-
property
ingestion_time
¶ Gets the ingestion time.
- Type
str
-
property
labels
¶ Gets the Resource labels.
- Type
dict
-
list_permissions
() → List[crux.models.permission.Permission]¶ Lists the permission on the resource.
- Returns
List of Permission Objects.
- Return type
list (
crux.models.Permission
)
-
property
media_type
¶ Gets the Media type.
- Type
str
-
property
modified_at
¶ Gets modified_at.
- Type
str
-
property
name
¶ Gets the Resource Name.
- Type
str
-
property
path
¶ Compute or Get the resource path.
- Type
str
-
property
provenance
¶ Gets the Provenance.
- Type
dict
-
refresh
()¶ Refresh Resource model from API backend.
- Returns
- True, if it is able to refresh the model,
False otherwise.
- Return type
bool
-
property
size
¶ Gets the size.
- Type
int
-
property
storage_id
¶ Gets the Storage ID.
- Type
str
-
property
supplier_implied_dt
¶ Gets the supplier date.
- Type
str
-
property
tags
¶ Gets the Resource Tags.
- Type
list
of str
-
property
type
¶ Gets the Resource Type.
- Type
str
-
update
(name: Optional[str] = None, description: Optional[str] = None, tags: Optional[List[str]] = None, provenance: Optional[str] = None) → bool¶ Updates the metadata for Resource.
- Parameters
name (str) – Name of resource. Defaults to None.
description (str) – Description of the resource. Defaults to None.
tags (list of str) – List of tags. Defaults to None.
provenance (str) – Provenance for a resource. Defaults to None.
- Returns
True, if resource is updated.
- Return type
bool
- Raises
ValueError – It is raised if name, description or tags are unset.
TypeError – It is raised if tags are not of type List.
-
-
class
crux.models.
File
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.resource.Resource
File Model.
-
download
(dest: str, chunk_size: int = 10485760, only_use_crux_domains: Optional[bool] = None) → bool¶ Downloads the file resource.
- Parameters
dest (str or file) – Local OS path at which file resource will be downloaded.
chunk_size (int) – Number of bytes to be read in memory.
only_use_crux_domains (bool) – True if content is required to be downloaded from Crux domains else False.
- Returns
True if it is downloaded.
- Return type
bool
- Raises
TypeError – If dest is not a file like or string type.
-
iter_content
(chunk_size: int = 10485760, only_use_crux_domains: Optional[bool] = None) → Iterable[str]¶ Streams the file resource.
- Parameters
chunk_size (int) – Chunk Size for the stream.
only_use_crux_domains (bool) – True if content is required to be downloaded from Crux domains else False.
- Yields
bytes – Bytes of file resource.
- Raises
ValueError – If chunk_size is not a multiple of 256 KiB.
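The 256 KiB constraint on chunk_size can be checked before streaming starts. The following is a minimal sketch and not part of the crux package: CHUNK_UNIT, is_valid_chunk_size, round_up_chunk_size and copy_stream are hypothetical helper names, and copy_stream only assumes an object exposing the iter_content method documented above.

```python
CHUNK_UNIT = 256 * 1024  # 256 KiB, the multiple named in the Raises note


def is_valid_chunk_size(chunk_size: int) -> bool:
    """Return True if chunk_size is a positive multiple of 256 KiB."""
    return chunk_size > 0 and chunk_size % CHUNK_UNIT == 0


def round_up_chunk_size(requested: int) -> int:
    """Round a requested size up to the next multiple of 256 KiB."""
    if requested <= 0:
        raise ValueError("requested must be positive")
    units = -(-requested // CHUNK_UNIT)  # ceiling division
    return units * CHUNK_UNIT


def copy_stream(file_resource, out, chunk_size: int = 10485760) -> int:
    """Write every chunk yielded by iter_content() to out.

    file_resource is any object with the documented iter_content method
    (hypothetical stand-in for a crux File). Returns the bytes written.
    """
    if not is_valid_chunk_size(chunk_size):
        raise ValueError("chunk_size must be a multiple of 256 KiB")
    written = 0
    for chunk in file_resource.iter_content(chunk_size=chunk_size):
        out.write(chunk)
        written += len(chunk)
    return written
```

The documented default of 10485760 bytes is exactly 40 units of 256 KiB, so it passes the check unchanged.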
-
upload
(src: Union[IO, str], media_type: Optional[str] = None, only_use_crux_domains: Optional[bool] = None) → crux.models.file.File¶ Uploads the content to empty file resource.
- Parameters
src (str or file) – Local OS path whose content is to be uploaded.
media_type (str) – Content type of the file. Defaults to None.
only_use_crux_domains (bool) – True if content is required to be uploaded via Crux domains else False.
- Returns
File: File model object.
- Raises
TypeError – If src type is invalid.
-
-
class
crux.models.
Folder
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.resource.Resource
Folder Model.
-
add_permission
(identity_id: str, permission: str, recursive: bool = False) → Union[bool, crux.models.permission.Permission]¶ Adds permission to the Folder resource.
- Parameters
identity_id (str) – Identity Id to be set.
permission (str) – Permission to be set.
recursive (bool) – If recursive is set to True, it will recursively apply permission to all resources under the folder resource. Defaults to False.
- Returns
- If recursive is set then it returns True.
If recursive is unset then it returns Permission object.
- Return type
bool or crux.models.Permission
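Because the return value is either True or a Permission object depending on recursive, callers may want a single boolean outcome. A minimal sketch under that assumption follows; grant_folder_permission is a hypothetical helper, and folder is any object exposing the add_permission signature documented above.

```python
def grant_folder_permission(folder, identity_id: str, permission: str,
                            recursive: bool = False) -> bool:
    """Call add_permission and normalise its Union return value to a bool."""
    result = folder.add_permission(identity_id, permission, recursive=recursive)
    if recursive:
        # Documented behaviour: a recursive call returns True on success.
        return result is True
    # Documented behaviour: a non-recursive call returns a Permission object.
    # Assumption: None or False would indicate that nothing was granted.
    return result is not None and result is not False
```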
-
delete_permission
(identity_id: str, permission: str, recursive: bool = False) → bool¶ Deletes permission from Folder resource.
- Parameters
identity_id (str) – Identity Id for the deletion.
permission (str) – Permission for deletion.
recursive (bool) – If recursive is set to True, it will recursively delete permission from all resources under the folder resource. Defaults to False.
- Returns
True if it is able to delete it.
- Return type
bool
-
-
class
crux.models.
Dataset
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
Dataset Model.
-
add_label
(label_key: str, label_value: str) → bool¶ Adds label to Dataset.
- Parameters
label_key (str) – Label Key for Dataset.
label_value (str) – Label Value for Dataset.
- Returns
True if labels are added.
- Return type
bool
-
add_permission
(identity_id: str, permission: str) → Union[bool, crux.models.permission.Permission]¶ Adds permission to the Dataset.
- Parameters
identity_id (str) – Identity Id to be set.
permission (str) – Permission to be set.
- Returns
Permission Object.
- Return type
crux.models.Permission
-
add_permission_to_resources
(identity_id: str, permission: str, resource_paths: Optional[List[str]] = None, resource_objects: Optional[List[Union[crux.models.file.File, crux.models.folder.Folder]]] = None, resource_ids: Optional[List[str]] = None) → bool¶ Adds permission to all or specific Dataset resources.
- Parameters
identity_id (str) – Identity Id to be set.
permission (str) – Permission to be set.
resource_paths (list of str) – List of resource paths on which the permission should be applied. If none of resource_paths, resource_objects or resource_ids is set, the permission is applied to the whole dataset.
resource_objects (list of crux.models.Resource) – List of resource objects on which the permission should be applied. Overrides resource_paths. If none of resource_paths, resource_objects or resource_ids is set, the permission is applied to the whole dataset.
resource_ids (list of str) – List of resource ids on which the permission should be applied. Overrides resource_paths and resource_objects. If none of resource_paths, resource_objects or resource_ids is set, the permission is applied to the whole dataset.
- Returns
True if permission is applied.
- Return type
bool
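The override order described for the three selector parameters can be sketched as a small pure function. This is an illustration of the documented precedence only, not the library's implementation; effective_selector is a hypothetical name, and treating an empty list the same as None is an assumption.

```python
from typing import Optional, Tuple


def effective_selector(resource_paths: Optional[list] = None,
                       resource_objects: Optional[list] = None,
                       resource_ids: Optional[list] = None
                       ) -> Tuple[str, Optional[list]]:
    """Return which selector would win, following the documented overrides.

    resource_ids overrides resource_objects, which overrides resource_paths.
    If none is set, the permission applies to the whole dataset.
    """
    if resource_ids:
        return "resource_ids", list(resource_ids)
    if resource_objects:
        return "resource_objects", list(resource_objects)
    if resource_paths:
        return "resource_paths", list(resource_paths)
    return "whole_dataset", None
```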
-
property
contact_identity_id
¶ Gets the Contact Identity ID.
- Type
str
-
create
() → bool¶ Creates the Dataset.
- Returns
True if dataset is created.
- Return type
bool
-
create_file
(path: str, tags: Optional[List[str]] = None, description: Optional[str] = None) → crux.models.file.File¶ Creates File resource in Dataset.
- Parameters
path (str) – Path of the file resource.
tags (list of str) – Tags of the file resource. Defaults to None.
description (str) – Description of the file resource.
- Returns
File Object.
- Return type
crux.models.File
-
create_folder
(path: str, folder: str = '/', tags: Optional[List[str]] = None, description: Optional[str] = None) → crux.models.folder.Folder¶ Creates Folder resource in Dataset.
- Parameters
path (str) – Path of the Folder resource.
folder (str) – Parent folder of the Folder resource. Defaults to /.
tags (list of str) – Tags of the Folder resource. Defaults to None.
description (str) – Description of the Folder resource. Defaults to None.
- Returns
Folder Object.
- Return type
crux.models.Folder
-
property
created_at
¶ Gets the Dataset created_at.
- Type
str
-
delete
() → bool¶ Deletes the Dataset.
- Returns
True if dataset is deleted.
- Return type
bool
-
delete_label
(label_key: str) → bool¶ Deletes label from Dataset.
- Parameters
label_key (str) – Label Key for Dataset.
- Returns
True if labels are deleted.
- Return type
bool
-
delete_permission
(identity_id: str, permission: str) → bool¶ Deletes permission from the Dataset.
- Parameters
identity_id (str) – Identity Id for the deletion.
permission (str) – Permission for the deletion.
- Returns
True if it is able to delete it.
- Return type
bool
-
delete_permission_from_resources
(identity_id: str, permission: str, resource_paths: Optional[List[str]] = None, resource_objects: Optional[List[Union[crux.models.file.File, crux.models.folder.Folder]]] = None, resource_ids: Optional[List[str]] = None) → bool¶ Deletes permission from all or specific Dataset resources.
- Parameters
identity_id (str) – Identity Id for the deletion.
permission (str) – Permission for the deletion.
resource_paths (list of str) – List of resource paths from which the permission should be deleted. If none of resource_paths, resource_objects or resource_ids is set, the permission is deleted from the whole dataset.
resource_objects (list of crux.models.Resource) – List of resource objects from which the permission should be deleted. Overrides resource_paths. If none of resource_paths, resource_objects or resource_ids is set, the permission is deleted from the whole dataset.
resource_ids (list of str) – List of resource ids from which the permission should be deleted. Overrides resource_paths and resource_objects. If none of resource_paths, resource_objects or resource_ids is set, the permission is deleted from the whole dataset.
- Returns
True if it is able to delete the permission.
- Return type
bool
-
property
description
¶ Gets the Dataset Description.
- Type
str
-
download_files
(folder: str, local_path: str, only_use_crux_domains: Optional[bool] = None) → List[str]¶ Downloads the resources recursively.
- Parameters
folder (str) – Crux Dataset Folder from where the file resources should be recursively downloaded.
local_path (str) – Local OS Path where the file resources should be downloaded.
only_use_crux_domains (bool) – True if content is required to be downloaded from Crux domains else False.
- Returns
List of locations of the downloaded files.
- Return type
list (
str
)
- Raises
ValueError – If folder or local_path is None.
OSError – If local_path is an invalid directory location.
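Since an invalid local_path is reported as OSError and a missing one as ValueError, the target directory can be prepared before the download is requested. The sketch below uses only the standard library; prepare_download_dir is a hypothetical helper and is not part of the crux package.

```python
import os


def prepare_download_dir(local_path: str) -> str:
    """Create local_path if needed and return its absolute form.

    Raises ValueError for None, mirroring the documented error, and lets
    os.makedirs raise an OSError subclass if the path exists as a file.
    """
    if local_path is None:
        raise ValueError("local_path must not be None")
    os.makedirs(local_path, exist_ok=True)
    return os.path.abspath(local_path)
```

The returned path could then be passed as the local_path argument of download_files.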
-
find_resources_by_label
(predicates: List[Dict[str, str]], max_per_page: int = 1000) → Iterator[Union[crux.models.file.File, crux.models.folder.Folder]]¶ Searches the resources in the Dataset for the given labels.
Each predicate can be either:
Lexicographical equal
Lexicographical not equal
Lexicographical less than
Lexicographical less than or equal to
Lexicographical greater than
Lexicographical greater than or equal to
A list of OR predicates
A list of AND predicates
predicates = [
    {"op": "eq", "key": "key1", "val": "abcd"},
    {"op": "ne", "key": "key1", "val": "zzzz"},
    {"op": "lt", "key": "key1", "val": "abd"},
    {"op": "gt", "key": "key1", "val": "abc"},
    {"op": "lte", "key": "key1", "val": "abd"},
    {"op": "gte", "key": "key1", "val": "abc"},
    {"op": "or", "in": [
        {"op": "eq", "key": "key1", "val": "abcd"},
        # more OR predicates...
    ]},
    {"op": "and", "in": [
        {"op": "eq", "key": "key1", "val": "abcd"},
        # more AND predicates...
    ]},
]
- Parameters
predicates (list of dict) – List of dictionary predicates for finding resources.
max_per_page (int) – Pagination limit. Defaults to 1000.
- Returns
List of resources matching the query parameters.
- Return type
list (
crux.models.Resource
)
Example
from crux import Crux

conn = Crux()
dataset_object = conn.get_dataset(id="dataset_id")
predicates = [
    {"op": "eq", "key": "test_label1", "val": "test_value1"}
]
resource_objects = dataset_object.find_resources_by_label(
    predicates=predicates
)
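Nested predicate dictionaries are easy to mistype by hand, so small builder functions may help. The sketch below only produces dictionaries in the shape documented for this method; compare, any_of and all_of are hypothetical helper names, and the label keys and values are made up for illustration.

```python
from typing import Dict

Predicate = Dict[str, object]

# Comparison operators that appear in the documented predicates example.
_COMPARISON_OPS = {"eq", "ne", "lt", "gt", "lte", "gte"}


def compare(op: str, key: str, val: str) -> Predicate:
    """Build one comparison predicate, e.g. {"op": "eq", "key": k, "val": v}."""
    if op not in _COMPARISON_OPS:
        raise ValueError("unsupported comparison operator: %r" % op)
    return {"op": op, "key": key, "val": val}


def any_of(*predicates: Predicate) -> Predicate:
    """Combine predicates into a single OR predicate."""
    return {"op": "or", "in": list(predicates)}


def all_of(*predicates: Predicate) -> Predicate:
    """Combine predicates into a single AND predicate."""
    return {"op": "and", "in": list(predicates)}


# Illustrative label keys and values (not taken from any real dataset).
predicates = [
    any_of(
        compare("eq", "region", "us"),
        compare("eq", "region", "eu"),
    ),
    compare("gte", "year", "2019"),
]
```

The resulting list has the same structure as the predicates argument shown in the Example and could be passed to find_resources_by_label in its place.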
-
get_delivery
(delivery_id: str) → crux.models.delivery.Delivery¶ Gets Delivery object.
- Parameters
delivery_id (str) – Delivery ID.
- Returns
Delivery Object.
- Return type
- Raises
ValueError – If delivery_id value is invalid.
-
get_file
(path: str) → crux.models.file.File¶ Gets the File resource object.
- Parameters
path (str) – File resource path.
- Returns
File Object.
- Return type
crux.models.File
-
get_files_range
(start_date: Union[datetime.datetime, str], end_date: Union[datetime.datetime, str, None] = None, frames: Union[str, list, None] = None, file_format: str = 'avro/binary', dayfirst: bool = False, yearfirst: bool = False, latest_only: bool = False, delivery_status: Optional[str] = None, use_cache: Optional[bool] = None) → Iterator[crux.models.file.File]¶ Get a set of dataset file resources. The best single delivery version for each supplier_implied_dt is selected for the given time range.
- Parameters
start_date (datetime, str) – ISO format start datetime or any parseable date string.
end_date (datetime, str) – ISO format end datetime or any parseable date string.
delivery_status (str) – Delivery status enum.
frames (str, list) – Filter for selected frames.
file_format (str) – File format of delivery.
dayfirst (bool) – Parse format for str date.
yearfirst (bool) – Parse format for str date.
latest_only (bool) – Return latest files only.
use_cache (bool) – Preference to set cached response.
- Returns
List of file resources.
- Return type
list (
crux.models.File
)
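The start_date and end_date arguments accept ISO format strings, which the standard library can produce directly. The sketch below builds a trailing window of days; iso_range is a hypothetical helper name, and the choice of a local naive timestamp when no end is given is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def iso_range(days_back: int, end: Optional[datetime] = None) -> Tuple[str, str]:
    """Return (start, end) as ISO format strings covering the last days_back days."""
    if days_back < 0:
        raise ValueError("days_back must not be negative")
    end = end if end is not None else datetime.now()
    start = end - timedelta(days=days_back)
    return start.isoformat(), end.isoformat()


# Fixed end date so the result is reproducible.
start, end = iso_range(7, end=datetime(2020, 1, 8, 12, 0, 0))
```

The two strings could then be supplied as start_date and end_date to get_files_range.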
-
get_folder
(path: str) → crux.models.folder.Folder¶ Gets the Folder resource object.
- Parameters
path (str) – Folder resource path.
- Returns
Folder Object.
- Return type
crux.models.Folder
-
get_ingestions
(start_date: Optional[str] = None, end_date: Optional[str] = None, delivery_status: Optional[str] = None, use_cache: Optional[bool] = None) → Iterator[crux.models.ingestion.Ingestion]¶ Gets Ingestions.
- Parameters
start_date (str) – ISO format start time.
end_date (str) – ISO format end time.
delivery_status (str) – Delivery status enum
use_cache (bool) – Preference to set cached response
- Returns
Iterator of Ingestion objects.
- Return type
Iterator of crux.models.Ingestion
-
get_label
(label_key: str) → crux.models.label.Label¶ Gets label value of Dataset.
- Parameters
label_key (str) – Label Key for Dataset.
- Returns
Label Object.
- Return type
crux.models.Label
-
get_latest_files
(frames: Union[str, List, None] = None, file_format: str = 'avro/binary', cutoff_date: Optional[str] = None, dayfirst: bool = False, yearfirst: bool = False, delivery_status: Optional[str] = None, use_cache: Optional[bool] = None) → Iterator[crux.models.file.File]¶ Get the latest dataset file resources. The latest supplier_implied_dt with the best single delivery version for each frame is selected.
- Parameters
frames (str, list) – filter for selected frames
file_format (str) – File format of delivery
cutoff_date (datetime, str) – Search up to this date
dayfirst (bool) – Parse format for str date
yearfirst (bool) – Parse format for str date
delivery_status (str) – Delivery status enum
use_cache (bool) – Preference to set cached response
- Returns
List of file resources.
- Return type
list (
crux.models.File
)
-
get_resources_batch
(resource_ids: list) → Iterator[crux.models.file.File]¶ Gets resource metadata.
- Parameters
resource_ids (list) – List of resource IDs
- Returns
List of file resources.
- Return type
list (
crux.models.File
)
-
get_stitch_job
(job_id: str) → crux.models.job.StitchJob¶ Stitch Job Details.
- Parameters
job_id (str) – Job ID of the Stitch Job.
- Returns
StitchJob object.
- Return type
crux.models.job.StitchJob
-
property
id
¶ Gets the Dataset ID.
- Type
str
-
list_files
(sort: Optional[str] = None, folder: str = '/', cursor: Optional[str] = None, limit: int = 100) → List[crux.models.file.File]¶ Lists the files.
- Parameters
sort (str) – Sets whether to sort or not. Defaults to None.
folder (str) – Folder for which resource should be listed. Defaults to /.
cursor (str) – Sets the offset to the page cursor. Defaults to None.
limit (int) – Sets the limit. Defaults to 100.
- Returns
List of File objects.
- Return type
list (
crux.models.File
)
-
list_permissions
() → List[crux.models.permission.Permission]¶ Lists the permission on the Dataset.
- Returns
List of Permission Objects.
- Return type
list (
crux.models.Permission
)
-
list_resources
(folder: str = '/', cursor: str = None, limit: int = 1, include_folders: bool = False, sort: str = None) → Generator[Resource]¶ Lists the resources in Dataset.
- Parameters
folder (str) – Folder for which resource should be listed. Defaults to /.
cursor (str) – Sets the offset to the page cursor. Defaults to None.
limit (int) – Sets the limit. Defaults to 1.
include_folders (bool) – Sets whether to include folders or not. Defaults to False.
sort (str) – Sets whether to sort or not. Defaults to None.
- Returns
List of File resource objects.
- Return type
list (
crux.models.Resource
)
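Because list_resources is declared as a generator, its results can be consumed lazily and aggregated in one pass. The sketch below totals the documented size property per documented media_type; size_by_media_type is a hypothetical helper, and treating a missing size as zero is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable


def size_by_media_type(resources: Iterable) -> Dict[str, int]:
    """Sum the size of each resource, grouped by its media_type."""
    totals = defaultdict(int)
    for resource in resources:
        # Assumption: size may be None for some resources (e.g. folders).
        totals[resource.media_type] += resource.size or 0
    return dict(totals)
```

Any iterable of objects with media_type and size attributes works, so the generator returned by list_resources could be passed in directly.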
-
property
modified_at
¶ Gets the Dataset modified_at.
- Type
str
-
property
name
¶ Gets the Dataset Name.
- Type
str
-
property
owner_identity_id
¶ Gets the Owner Identity ID.
- Type
str
-
property
provenance
¶ Compute or Get the provenance.
- Type
str
-
refresh
()¶ Refresh Dataset model from API backend.
- Returns
- True, if it is able to refresh the model,
False otherwise.
- Return type
bool
-
stitch
(source_resources: List[Union[str, crux.models.file.File]], destination_resource: str, labels: Optional[str] = None, tags: Optional[List[str]] = None, description: Optional[str] = None) → Tuple[crux.models.file.File, str]¶ Stitches multiple Avro resources into a single Avro resource.
- Parameters
source_resources (list of str or file) – List of resource paths which are to be stitched.
destination_resource (str) – Resource path to load the stitched output.
labels (dict) – Key/value labels that should be applied to the stitched resource.
tags (list of str) – List of tags to be applied on destination resource. Taken into consideration if resource is required to be created.
description (str) – Description to be applied to the created destination resource. Taken into consideration if resource is required to be created.
- Returns
- File object of destination resource.
Job ID for background running job.
- Return type
tuple (
crux.models.File
, str
)
- Raises
TypeError – Source and Destination resource should be of type File or String.
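stitch returns a job ID for a background job, and get_stitch_job looks that job up again, so a caller may want to poll until the job finishes. The sketch below is one way to do that under stated assumptions: wait_for_stitch_job is a hypothetical helper, the returned job object is assumed to expose a status property like the Job model, and the terminal status names are not documented here, so the caller must supply them.

```python
import time
from typing import Callable, Collection


def wait_for_stitch_job(dataset, job_id: str,
                        terminal_statuses: Collection[str],
                        interval_seconds: float = 5.0,
                        max_attempts: int = 60,
                        sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll dataset.get_stitch_job(job_id) until its status is terminal.

    Returns the terminal status, or raises TimeoutError after max_attempts.
    """
    for attempt in range(max_attempts):
        job = dataset.get_stitch_job(job_id)
        status = job.status  # Assumption: StitchJob has a status property.
        if status in terminal_statuses:
            return status
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError("stitch job %s did not finish in time" % job_id)
```

The sleep parameter exists only so the wait can be replaced in tests; in normal use the default time.sleep applies.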
-
property
tags
¶ Gets the tags.
- Raises
TypeError – If tags is not a list
- Type
str
-
update
(name: Optional[str] = None, description: Optional[str] = None, tags: Optional[List[str]] = None) → bool¶ Updates the Dataset.
- Parameters
name (str) – Name of the dataset. Defaults to None.
description (str) – Description of the dataset. Defaults to None.
tags (list of str) – List of tags. Defaults to None.
- Returns
True, if dataset is updated.
- Return type
bool
-
upload_file
(src: Union[IO, str], dest: str, media_type: Optional[str] = None, description: Optional[str] = None, tags: Optional[List[str]] = None, only_use_crux_domains: Optional[bool] = None) → crux.models.file.File¶ Uploads the File.
- Parameters
src (str or file) – Local OS path whose content is to be uploaded to file resource.
dest (str) – File resource path.
media_type (str) – Content type of the file. Defaults to None.
description (str) – Description of the file. Defaults to None.
tags (list of str) – Tags to be attached to the file resource.
only_use_crux_domains (bool) – True if content is required to be uploaded via Crux domains else False.
- Returns
File Object.
- Return type
crux.models.File
-
upload_files
(local_path: str, folder: str, media_type: Optional[str] = None, description: Optional[str] = None, tags: Optional[List[str]] = None, only_use_crux_domains: Optional[bool] = None) → List[crux.models.file.File]¶ Uploads the resources recursively.
- Parameters
local_path (str) – Local OS Path from where the file resources should be uploaded.
media_type (str) – Content Types of File resources to be uploaded. Defaults to None.
folder (str) – Crux Dataset Folder where file resources should be recursively uploaded.
description (str) – Description to be set on uploaded resources. Defaults to None.
tags (list of str) – Tags to be set on uploaded resources. Defaults to None.
only_use_crux_domains (bool) – True if content is required to be uploaded via Crux domains else False.
- Returns
List of uploaded file objects.
- Return type
list (
crux.models.File
)
- Raises
ValueError – If folder or local_path is None.
OSError – If local_path is an invalid directory location.
-
property
website
¶ Gets the Dataset Website.
- Type
str
-
-
class
crux.models.
Label
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
Label Model.
-
property
label_key
¶ Gets the Label Key.
- Type
str
-
property
label_value
¶ Gets the Label Value.
- Type
str
-
property
-
class
crux.models.
Delivery
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
Delivery Model.
-
property
dataset_id
¶ Gets the Dataset ID.
- Type
str
-
get_data
(file_format: str = 'avro/binary', use_cache: Optional[bool] = None) → Iterator[crux.models.resource.Resource]¶ Get the processed delivery data
- Parameters
file_format (str) – File format of delivery.
use_cache (bool) – Preference to set cached response
- Returns
List of resources.
- Return type
list (
crux.models.Resource
)
-
get_healthlog
(use_cache: Optional[bool] = None) → Iterator[crux.models.resource.Resource]¶ Get delivery healthlog information
- Parameters
use_cache (bool) – Preference to set cached response
- Returns
Healthlog Json Object.
- Return type
dict
-
get_raw
(use_cache: Optional[bool] = None) → Iterator[crux.models.resource.Resource]¶ Get the raw delivery data
- Parameters
use_cache (bool) – Preference to set cached response
- Returns
List of resources.
- Return type
list (
crux.models.Resource
)
-
property
id
¶ Gets the Delivery ID.
- Type
str
-
property
ingestion_time
¶ Gets ingestion time of delivery.
- Type
str
-
property
schedule_datetime
¶ Gets schedule datetime of delivery.
- Type
str
-
property
status
¶ Gets the Status of delivery.
- Type
str
-
property
summary
¶ Gets the Delivery Summary
- Type
dict
-
property
-
class
crux.models.
Ingestion
(raw_model: Optional[Dict] = None, connection: Optional[crux._client.CruxClient] = None)¶ Bases:
crux.models.model.CruxModel
Ingestion Model.
-
property
dataset_id
¶ Gets the Dataset ID.
- Type
str
-
get_data
(version: Optional[int] = None, file_format: str = 'avro/binary') → Iterator[crux.models.resource.Resource]¶ Get the processed delivery data
- Parameters
version (int) – Version of the delivery.
file_format (str) – File format of delivery.
status (accepted) – List of acceptable statuses. Defaults to None.
- Returns
List of resources.
- Return type
list (
crux.models.Resource
)
-
get_raw
(version=None) → Iterator[crux.models.resource.Resource]¶ Get the raw delivery data
- Parameters
version (int) – Version of the delivery.
- Returns
List of resources.
- Return type
list (
crux.models.Resource
)
-
property
id
¶ Gets the Ingestion ID.
- Type
str
-
property
versions
¶ Gets the list of versions.
- Type
list
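The versions property pairs naturally with the version argument of get_data. The sketch below selects the highest version before fetching; latest_version_data is a hypothetical helper, and it assumes that the entries of versions are comparable values (such as ints) where the largest is the most recent.

```python
def latest_version_data(ingestion, file_format: str = "avro/binary"):
    """Return get_data() for the highest entry in ingestion.versions.

    Returns an empty iterator when no versions are listed.
    """
    versions = list(ingestion.versions or [])
    if not versions:
        return iter(())
    # Assumption: the largest version value is the latest one.
    return ingestion.get_data(version=max(versions), file_format=file_format)
```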
-
property