Data administration in KeyCore Enterprise Data Lake (KEDL) is a crucial aspect of ensuring smooth data operations and maintaining data integrity within the organization. In this article, we will explore the different user types in KEDL, the role of stewards in approving access to data products, compliance checks for self-service datasets, and the creation and management of user profiles.
1. User Types in KEDL
In KEDL, there are three main types of users, each with specific roles and responsibilities:
Owner: Owners are users who have created the datasets in KEDL. They hold administrative privileges to manage datasets, assign stewards, and approve user access requests. As dataset creators, owners have a pivotal role in governing data products within the Data Lake.
Steward: Stewards are responsible for reviewing and approving user access requests to specific datasets. They act as gatekeepers, ensuring that only authorized users gain access to sensitive data. Stewards play a critical role in maintaining data security and compliance.
Member: Members are the users who actively work with the data products in the Data Lake. They can be assigned different profiles, such as Consumer or Developer, based on their roles and responsibilities. A Consumer typically has read-only access to data, while a Developer has write access to different stages of the data flow.
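The relationship between the three user types can be sketched in a few lines of Python. This is a minimal illustration, not KEDL's actual data model: the `Dataset` class, the `can_approve` and `request_access` functions, and the user names are all hypothetical, and the only rule encoded is the one stated above, namely that owners and stewards may approve access requests.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical stand-in for a KEDL dataset and its user assignments."""
    name: str
    owner: str                                       # user who created the dataset
    stewards: set[str] = field(default_factory=set)  # assigned by the owner
    members: set[str] = field(default_factory=set)   # users granted access


def can_approve(dataset: Dataset, user: str) -> bool:
    """Owners and stewards may approve access requests."""
    return user == dataset.owner or user in dataset.stewards


def request_access(dataset: Dataset, requester: str, approver: str) -> bool:
    """Add the requester as a member only if the approver is allowed to approve."""
    if not can_approve(dataset, approver):
        return False
    dataset.members.add(requester)
    return True


# Example data: alice owns the dataset and has assigned bob as steward.
sales = Dataset(name="sales", owner="alice", stewards={"bob"})
```

With this setup, a request approved by `bob` or `alice` adds the requester to `sales.members`, while a request "approved" by an ordinary member is rejected.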
2. Dataset Compliance and Self-Service Classification
In KEDL, whether a dataset is self-service depends on compliance checks and user-defined classifications. Compliance checks cover attributes such as GDPR, GxP, and PII, which help assess the sensitivity and security requirements of the dataset.
By specifying compliance attributes and creating user-defined classifications, dataset owners can categorize datasets as either self-service or not. Self-service datasets allow users to access the data without requiring steward approval, streamlining data accessibility for approved users.
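As a rough illustration, the self-service decision could be modeled as a function of the compliance attributes and the owner's classification. The article does not state the exact rule KEDL applies, so the rule below is an assumption: a dataset counts as self-service only if it carries no compliance flag and its classification is one the owner has marked as open. The class, function, and classification names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComplianceAttributes:
    """Compliance flags named in the text; all default to 'not applicable'."""
    gdpr: bool = False
    gxp: bool = False
    pii: bool = False


# Hypothetical user-defined classifications that an owner treats as open.
SELF_SERVICE_CLASSIFICATIONS = {"public", "internal"}


def is_self_service(attrs: ComplianceAttributes, classification: str) -> bool:
    """Assumed rule: no compliance flag set AND an open classification."""
    has_compliance_flag = attrs.gdpr or attrs.gxp or attrs.pii
    return (not has_compliance_flag
            and classification in SELF_SERVICE_CLASSIFICATIONS)
```

Under this assumed rule, a dataset flagged as PII always requires steward approval, regardless of how the owner classifies it.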
As data governance evolves, KEDL is exploring the possibility of introducing a Domain Specific Language (DSL) to express more intricate compliance relations, catering to the dynamic needs of data administration.
3. User Profiles and Data Access Capabilities
KEDL lets you create user profiles that define the capabilities and access permissions of members. Profile types include Consumer, Developer, and Analyst, among others, and are stored in the AWS Systems Manager (SSM) Parameter Store in the main account.
A Consumer profile typically has limited access and can only read data from the final stage of the data flow. In contrast, a Developer enjoys broader access, including write privileges to various stages of the data flow. Additional profiles can be easily added to accommodate evolving data access requirements.
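To make the difference between profiles concrete, here is a small sketch of how a profile definition might look and how a permission check against it could work. The JSON layout, the stage names (`raw`, `staged`, `final`), and the function names are assumptions for illustration; the article does not document the format KEDL actually stores in the Parameter Store. In a real deployment the JSON string would be read from the Parameter Store rather than defined inline.

```python
import json

# Hypothetical profile document, as it might be stored as a parameter value.
PROFILES_JSON = """
{
  "Consumer":  {"read": ["final"], "write": []},
  "Developer": {"read": ["raw", "staged", "final"], "write": ["raw", "staged"]}
}
"""


def load_profiles(raw: str) -> dict:
    """Parse the profile document into a dictionary keyed by profile name."""
    return json.loads(raw)


def is_allowed(profiles: dict, profile: str, action: str, stage: str) -> bool:
    """Return True if the profile grants the action ('read' or 'write') on the stage."""
    permissions = profiles.get(profile, {})   # unknown profiles grant nothing
    return stage in permissions.get(action, [])


profiles = load_profiles(PROFILES_JSON)
```

Adding a new profile, such as an Analyst, would then only require a new entry in the stored document, with no change to the checking code.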
4. Data Product Search and Reporting
KEDL includes a search function that lets all users find datasets by tag. Tags include default parameters such as the dataset owner or dataset name, and you can also add your own tags specific to a dataset, which makes individual datasets easier to locate.
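Tag-based search can be pictured as a filter over a catalog of datasets. The sketch below is a simplified model, not the KEDL search API: the catalog structure, the `search_by_tags` function, and the tag names other than the owner are hypothetical. A dataset matches when every requested tag is present with the requested value.

```python
from __future__ import annotations


def search_by_tags(catalog: list[dict], **wanted: str) -> list[str]:
    """Return the names of datasets whose tags match all requested key/value pairs."""
    return [
        dataset["name"]
        for dataset in catalog
        if all(dataset["tags"].get(key) == value for key, value in wanted.items())
    ]


# Example catalog: 'owner' is a default tag, 'region' is a custom tag.
catalog = [
    {"name": "sales_2023", "tags": {"owner": "alice", "region": "emea"}},
    {"name": "hr_headcount", "tags": {"owner": "bob"}},
    {"name": "sales_2024", "tags": {"owner": "alice", "region": "apac"}},
]
```

For example, `search_by_tags(catalog, owner="alice")` selects both sales datasets, and adding `region="emea"` narrows the result to `sales_2023`.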
Furthermore, KEDL envisions the possibility of generating reports based on data usage and dataset creation. Such reports could prove valuable for managers and administrators seeking insights into data utilization patterns and trends across the organization.
Data administration is the foundation of an efficient and secure data management ecosystem. In KEDL, user roles, compliance checks, and user profiles contribute to the smooth functioning of the Data Lake, ensuring data is accessible, protected, and utilized effectively by the right stakeholders. The following articles will delve into dataset creation, user management, and other aspects of utilizing KEDL in your daily data operations.