Data ownership and stewardship means knowing who is responsible for looking after each type of data in your organization. You need a clear record of who owns what data and what obligations come with that ownership, and you need to review that record regularly so it stays up to date.
Where did this come from?
This control comes from the CSA Cloud Controls Matrix v4.0.10 - 2023-09-26. You can download the full document from https://cloudsecurityalliance.org/artifacts/cloud-controls-matrix-v4. The Cloud Controls Matrix provides a comprehensive set of security controls that are relevant for cloud computing environments.
For more background, check out the AWS page on the Shared Responsibility Model, which explains how security responsibilities are divided between AWS and the customer: https://aws.amazon.com/compliance/shared-responsibility-model/
Who should care?
This is most relevant for:
- Information Security Managers with responsibility for data governance
- Compliance Officers with obligations around data protection regulations
- Data Owners who are accountable for specific datasets
- Data Stewards who have day-to-day responsibility for managing data
What is the risk?
Without clear data ownership and stewardship, some key risks include:
- Data being misused, modified or deleted inappropriately because no one is actively managing it
- Non-compliance with data protection regulations because responsibilities are unclear
- Security gaps appearing over time as the data estate changes but ownership records aren't updated
Having a well-defined and well-executed data ownership & stewardship process helps to mitigate these risks significantly. It ensures there are named individuals responsible for securing and handling data appropriately on an ongoing basis.
What's the care factor?
For organizations with large, complex and sensitive data estates, this is a critical control. The risks of non-compliance, data breaches and operational disruption are high without it.
However, for smaller organizations or those with relatively simple data estates, a full-blown data ownership program may be overkill. A lightweight process with a few clearly defined roles may suffice.
The key thing is to understand your data and compliance obligations, and put in place a proportionate approach to manage the associated risks.
When is it relevant?
Data ownership and stewardship is important when:
- You are responsible for personal data or other sensitive information
- You have a complex data estate with many different types of data and storage locations
- You operate in a regulated industry with strict data protection requirements
- Proper data handling is critical to your business operations
It may be less relevant if:
- You have a simple, small scale data estate
- You don't handle personal or sensitive data
- Regulations and contractual requirements are minimal
- Impact of data issues is relatively low
What are the tradeoffs?
Implementing data ownership and stewardship does come with some costs:
- Staff time to define the process, assign owners and conduct reviews
- Potential friction with data consumers if access needs to be restricted
- Effort required from data owners to properly manage and secure their data
However, these are usually far outweighed by the benefits of reduced risk, smoother compliance and greater confidence in your data.
How to make it happen?
Here's a step-by-step guide to implementing data ownership and stewardship:
- Catalog your data: Create an inventory of all the personal and sensitive data in your organization. Record what the data is, where it's stored, and the associated regulatory/contractual obligations. AWS Glue crawlers and the AWS Glue Data Catalog can help automate parts of this.
- Define roles & responsibilities: Establish clear definitions for data owners and stewards. Data owners have overall accountability for a dataset, while stewards handle the day-to-day management. Document these roles in a data governance policy.
- Assign owners & stewards: For each dataset in your catalog, assign a named owner and steward. Record these assignments in a data responsibility matrix, including the data type, obligations, owner, steward and review frequency for each row.
- Implement access controls: Put technical controls in place to enforce appropriate access to each dataset based on the decisions of the owner. Tools like AWS IAM and AWS Lake Formation can enforce these permissions, and Amazon Macie can help find sensitive data in Amazon S3 that needs them. Stewards should regularly review permissions.
- Train & communicate: Make sure data owners and stewards understand their responsibilities. Communicate the data responsibility matrix and governance processes to all relevant stakeholders.
- Conduct regular reviews: At least annually, conduct a review of the data responsibility matrix with owners & stewards. Update it to reflect any changes in your data estate, regulatory obligations or organizational structure.
- Monitor & measure: Establish KPIs to track the effectiveness of your data ownership processes. Monitor things like % of data with assigned owners, frequency of permission reviews, # of unauthorized access attempts, etc. Use this to drive continuous improvement.
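The responsibility matrix and the KPIs from the steps above can live in a spreadsheet, but a small sketch in code may make the structure more concrete. The Python below is illustrative only: the `MatrixRow` fields, the sample datasets, the placeholder names and the 365-day review interval are all assumptions made for this example, not something defined by the CSA control or by any AWS service.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class MatrixRow:
    """One row of a hypothetical data responsibility matrix."""
    dataset: str
    data_type: str
    obligations: List[str]
    owner: Optional[str]            # accountable person; None if unassigned
    steward: Optional[str]          # day-to-day manager; None if unassigned
    review_interval_days: int       # how often this row must be reviewed
    last_reviewed: Optional[date]   # None if the row was never reviewed


def ownership_coverage(rows: List[MatrixRow]) -> float:
    """Percentage of datasets that have both an owner and a steward."""
    if not rows:
        return 0.0
    assigned = sum(1 for r in rows if r.owner and r.steward)
    return 100.0 * assigned / len(rows)


def overdue_reviews(rows: List[MatrixRow], today: date) -> List[str]:
    """Datasets whose review is overdue or has never happened."""
    overdue = []
    for r in rows:
        if r.last_reviewed is None:
            overdue.append(r.dataset)
        elif today - r.last_reviewed > timedelta(days=r.review_interval_days):
            overdue.append(r.dataset)
    return overdue


# Sample rows, invented purely for illustration.
matrix = [
    MatrixRow("customer_profiles", "personal data", ["GDPR"],
              "A. Owner", "B. Steward", 365, date(2023, 1, 10)),
    MatrixRow("payment_records", "financial data", ["PCI DSS"],
              "C. Owner", "D. Steward", 365, date(2024, 3, 1)),
    MatrixRow("web_logs", "operational data", [],
              None, None, 365, None),
]

today = date(2024, 6, 1)
print(f"Coverage: {ownership_coverage(matrix):.1f}%")
print("Overdue:", overdue_reviews(matrix, today))
```

With this sample data, two of the three datasets have an owner and steward, so coverage prints as 66.7%, and `customer_profiles` and `web_logs` are reported as overdue for review. Counting a never-reviewed row as overdue is a design choice for this sketch; it keeps unassigned data visible rather than silently skipping it.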
What are some gotchas?
A few things to watch out for:
- Owners must have sufficient authority and resources to execute their responsibilities effectively. Make sure they are senior enough and have the necessary tools & support.
- Stewards should have the right domain and technical skills for the data they manage. They may need training on data management best practices and tools.
- Access control rules must be carefully defined to balance security and usability. Overly restrictive rules can block legitimate data usage. Tools like AWS Lake Formation can help define granular, attribute-based access control.
- Conducting access reviews can be time-consuming, especially for large data estates. Consider automating access monitoring, for example with AWS CloudTrail logs and Amazon CloudWatch alarms, and triggering reviews when permissions drift from what the owner approved.
- When stakeholders change roles or leave the organization, make sure to update the responsibility matrix. Implement a formal handover process to maintain continuity.
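The points above about drift-triggered reviews and role changes lend themselves to simple automation. As a rough sketch, assume you can export the set of principals that currently have access to each dataset (for example from your IAM or Lake Formation tooling) and a list of active staff (for example from HR). The check below flags datasets whose actual access differs from what the owner approved, and datasets whose owner or steward is no longer active. Every function name, data structure and sample value here is hypothetical.

```python
from typing import Dict, List, Set


def access_drift(approved: Dict[str, Set[str]],
                 actual: Dict[str, Set[str]]) -> Dict[str, Dict[str, Set[str]]]:
    """Compare owner-approved access with actual access per dataset.

    Returns only the datasets where the two differ, listing principals
    that have access without approval ("unexpected") and principals that
    were approved but lack access ("missing").
    """
    drift = {}
    for dataset in sorted(set(approved) | set(actual)):
        want = approved.get(dataset, set())
        have = actual.get(dataset, set())
        if want != have:
            drift[dataset] = {"unexpected": have - want,
                              "missing": want - have}
    return drift


def stale_assignments(assignments: Dict[str, Dict[str, str]],
                      active_staff: Set[str]) -> List[str]:
    """Datasets whose owner or steward is no longer an active staff member."""
    return sorted(
        dataset for dataset, roles in assignments.items()
        if roles["owner"] not in active_staff
        or roles["steward"] not in active_staff
    )


# Sample data, invented purely for illustration.
approved = {
    "customer_profiles": {"crm-app", "analytics-team"},
    "payment_records": {"billing-app"},
}
actual = {
    "customer_profiles": {"crm-app", "analytics-team", "intern-sandbox"},
    "payment_records": {"billing-app"},
}
assignments = {
    "customer_profiles": {"owner": "alice", "steward": "bob"},
    "payment_records": {"owner": "carol", "steward": "dave"},
}
active_staff = {"alice", "bob", "dave"}  # "carol" has left in this example

print(access_drift(approved, actual))
print(stale_assignments(assignments, active_staff))
```

In this example the drift check reports `intern-sandbox` as unexpected access on `customer_profiles`, and the assignment check reports `payment_records` because its owner is no longer on the active staff list. Either result could be used to trigger a review or a handover instead of waiting for the annual cycle.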
What are the alternatives?
For smaller organizations, a simplified version of data ownership may work where there is a single owner for all data, or responsibility is divided at a high level by business function.
Some organizations may rely on a central data management function to handle stewardship rather than defining individual stewards for each dataset.
Automated data discovery & classification tools like Amazon Macie can reduce reliance on manual discovery and classification, although each dataset they find still needs an assigned owner.