Identify relevant dataset
Identify the dataset that needs to be validated. Pinning down the exact dataset up front keeps the validation focused on the data that matters to your process and reduces confusion and errors later. To complete this task, you may need access to the databases, files, or other sources where the dataset is stored.
Import data into appropriate software
Import the identified dataset into the software you will use for data validation, so that it can be processed correctly and analyzed effectively. The desired result is a dataset that is loaded and ready for validation. To complete this task, you will need access to the software or tools used for the import.
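As an illustration of the import step, here is a minimal sketch that assumes the dataset is a CSV file and that plain Python is the validation tool. The sample data, the column names, and the `load_rows` helper are hypothetical; the template itself does not prescribe a format or tool.

```python
import csv
import io

# Hypothetical sample standing in for a real file. With a file on disk,
# pass open(path, newline="", encoding="utf-8") instead of io.StringIO(...).
SAMPLE_CSV = """id,name,email,age
1,Ada,ada@example.com,36
2,Grace,,45
3,Alan,alan@example.com,41
"""


def load_rows(source):
    """Read CSV text from a file-like object into a list of dicts."""
    return list(csv.DictReader(source))


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(len(rows), "rows loaded; columns:", list(rows[0].keys()))
```

Every value is loaded as a string, so blank cells arrive as empty strings and can be picked up by the later completeness check.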
Check for completeness of data
Check the completeness of the imported data: all required fields and information should be present in the dataset. Review the dataset and compare it against the expected data structure or requirements. The desired result is to identify any missing data so that it can be addressed.
Identify missing data
Identify and document any missing data found during the completeness check, so that all necessary information is available for further analysis and decision-making. The desired result is a clear list of missing data elements that need to be addressed. To complete this task, you may need to cross-reference the dataset with external sources or consult relevant stakeholders.
Record count of missing data
Record the count of missing data identified in the previous step. Tracking the number of missing data points helps you evaluate overall data quality and judge the impact on subsequent analysis or processes. The desired result is an accurate count of missing data. To complete this task, you can use a numbers field to record the count.
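The completeness check, the list of missing elements, and the count can be sketched together as follows. This is only an illustration: `REQUIRED_FIELDS`, the sample rows, and the `find_missing` helper are assumptions, and a value is treated as missing when it is `None` or blank.

```python
from collections import Counter

# Hypothetical required fields and sample rows.
REQUIRED_FIELDS = ["id", "name", "email"]
rows = [
    {"id": "1", "name": "Ada", "email": "ada@example.com"},
    {"id": "2", "name": "Grace", "email": ""},
    {"id": "3", "name": None, "email": "   "},
]


def find_missing(rows, required_fields):
    """Return (row_index, field) pairs where a required value is absent or blank."""
    missing = []
    for index, row in enumerate(rows):
        for field in required_fields:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                missing.append((index, field))
    return missing


missing = find_missing(rows, REQUIRED_FIELDS)
missing_by_field = Counter(field for _, field in missing)

print("total missing:", len(missing))
print("by field:", dict(missing_by_field))
```

The total is the figure to record for this task; the per-field breakdown shows where the gaps are concentrated.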
Check data for duplicates
Review the dataset for duplicate entries. Duplicates can distort analysis and decision-making and lead to inaccurate results, so the desired result is to identify them and then remove or otherwise handle them. To complete this task, you will need appropriate tools or techniques for duplicate detection.
Record count of duplicate data
Record the count of duplicate entries found in the previous step. Tracking the number of duplicates helps you evaluate overall data quality and assess the impact on subsequent analysis or processes. The desired result is an accurate count of duplicate data. To complete this task, you can use a numbers field to record the count.
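One possible approach to duplicate detection is to group rows by a normalized key and count every row beyond the first in each group. The key field (`email`), the normalization rule, and the helper names below are assumptions chosen for illustration; which fields define a duplicate depends on your dataset.

```python
from collections import defaultdict

# Hypothetical choice of fields that together identify a record.
KEY_FIELDS = ["email"]
rows = [
    {"id": "1", "email": "ada@example.com"},
    {"id": "2", "email": "ADA@example.com "},
    {"id": "3", "email": "alan@example.com"},
]


def normalize(value):
    """Trim whitespace and lowercase so trivially different spellings match."""
    return str(value).strip().lower()


def find_duplicates(rows, key_fields):
    """Group row indices by normalized key; keep groups with more than one row."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        key = tuple(normalize(row.get(field, "")) for field in key_fields)
        groups[key].append(index)
    return {key: indices for key, indices in groups.items() if len(indices) > 1}


duplicates = find_duplicates(rows, KEY_FIELDS)
# Count the redundant rows: every row in a group beyond the first.
duplicate_count = sum(len(indices) - 1 for indices in duplicates.values())

print("duplicate groups:", duplicates)
print("duplicate count:", duplicate_count)
```

Counting redundant rows rather than all rows in a group is a design choice; record whichever definition your process uses, and use it consistently.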
Check data for consistency
Review the dataset for consistency, which is crucial for accurate analysis and decision-making. Evaluate the dataset against predefined rules, standards, or expectations. The desired result is to identify any inconsistencies or discrepancies in the data.
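Evaluating data against predefined rules might look like the sketch below. The two rules (an allowed set of country codes and a single date format) and all names are hypothetical examples, not rules taken from this template.

```python
from datetime import datetime

# Hypothetical standards the dataset is expected to follow.
ALLOWED_COUNTRIES = {"US", "GB", "DE"}
DATE_FORMAT = "%Y-%m-%d"

rows = [
    {"id": "1", "country": "US", "signup_date": "2023-04-01"},
    {"id": "2", "country": "usa", "signup_date": "2023-04-02"},
    {"id": "3", "country": "DE", "signup_date": "04/03/2023"},
]


def parses_as_date(text, fmt=DATE_FORMAT):
    """Return True if the text can be parsed with the expected date format."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


def find_inconsistencies(rows):
    """Return (row_index, field, value) for every value that breaks a rule."""
    problems = []
    for index, row in enumerate(rows):
        if row["country"] not in ALLOWED_COUNTRIES:
            problems.append((index, "country", row["country"]))
        if not parses_as_date(row["signup_date"]):
            problems.append((index, "signup_date", row["signup_date"]))
    return problems


inconsistencies = find_inconsistencies(rows)
for problem in inconsistencies:
    print(problem)
```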
Make needed corrections to inconsistent data
Correct any inconsistent data found during the previous step. Inconsistent data can lead to erroneous analysis and conclusions, so the desired result is consistent and accurate data. To complete this task, you may need to apply data cleaning techniques, consult the data source, or verify against external references.
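A simple data cleaning technique is to map known variants to a canonical value while keeping a log of every change. The alias table and the `correct_country` helper below are assumptions for illustration; real corrections should be confirmed with the data source where possible.

```python
# Hypothetical mapping from known variants to canonical country codes.
COUNTRY_ALIASES = {"usa": "US", "u.s.": "US", "uk": "GB"}

rows = [
    {"id": "1", "country": "US"},
    {"id": "2", "country": "usa"},
    {"id": "3", "country": " de "},
]


def correct_country(rows, aliases):
    """Return corrected copies of the rows plus a log of every change made."""
    corrected, change_log = [], []
    for index, row in enumerate(rows):
        new_row = dict(row)  # copy so the original rows stay untouched
        raw = row["country"].strip()
        fixed = aliases.get(raw.lower(), raw.upper())
        if fixed != row["country"]:
            new_row["country"] = fixed
            change_log.append((index, "country", row["country"], fixed))
        corrected.append(new_row)
    return corrected, change_log


corrected_rows, change_log = correct_country(rows, COUNTRY_ALIASES)
for entry in change_log:
    print(entry)
```

Keeping the originals and a change log makes the corrections reviewable during the consistency approval.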
Approval: Consistency Check
- Check data for consistency (will be submitted)
- Make needed corrections to inconsistent data (will be submitted)
Validate values in the data set
Validate the values in the dataset against the required criteria or standards. Validating values helps maintain data integrity and supports reliable analysis and decision-making. The desired result is validated, trustworthy data. To complete this task, define the validation criteria or rules and apply them to the dataset.
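Defining rules and applying them to the dataset can be sketched as a table of per-field checks. The rules shown (an age range and a loose email pattern) are hypothetical, and values are assumed to be strings, as they would be when read from a CSV file.

```python
import re

# Deliberately loose pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def valid_age(value):
    """Accept whole numbers from 0 to 120 given as strings."""
    return value.isdecimal() and 0 <= int(value) <= 120


def valid_email(value):
    return EMAIL_PATTERN.match(value) is not None


# Hypothetical validation rules, one per field.
RULES = {"age": valid_age, "email": valid_email}

rows = [
    {"age": "36", "email": "ada@example.com"},
    {"age": "-4", "email": "grace@example.com"},
    {"age": "41", "email": "alan-at-example.com"},
]


def validate_values(rows, rules):
    """Return (row_index, field, value) for every value that fails its rule."""
    failures = []
    for index, row in enumerate(rows):
        for field, rule in rules.items():
            value = row.get(field, "")
            if not rule(value):
                failures.append((index, field, value))
    return failures


failures = validate_values(rows, RULES)
for failure in failures:
    print(failure)
```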
Check for logical errors in the data
Review the dataset for logical errors, which can lead to incorrect interpretations or conclusions. Analyze the dataset for inconsistencies, illogical relationships, or contradictions between values. The desired result is to identify and resolve any logical errors found.
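Logical errors usually show up as contradictions between fields within the same record. The sketch below assumes a hypothetical orders dataset and two example cross-field rules; prices are stored as integer cents to avoid floating-point comparison problems.

```python
from datetime import date

# Hypothetical orders; unit_price and total are in cents.
orders = [
    {"id": 1, "ordered": date(2023, 5, 1), "shipped": date(2023, 5, 3),
     "quantity": 2, "unit_price": 500, "total": 1000},
    {"id": 2, "ordered": date(2023, 5, 4), "shipped": date(2023, 5, 2),
     "quantity": 1, "unit_price": 250, "total": 250},
    {"id": 3, "ordered": date(2023, 5, 5), "shipped": date(2023, 5, 6),
     "quantity": 3, "unit_price": 100, "total": 350},
]


def find_logical_errors(orders):
    """Return (order_id, message) for each cross-field contradiction."""
    errors = []
    for order in orders:
        if order["shipped"] < order["ordered"]:
            errors.append((order["id"], "shipped before ordered"))
        if order["total"] != order["quantity"] * order["unit_price"]:
            errors.append((order["id"], "total does not match quantity * unit_price"))
    return errors


logical_errors = find_logical_errors(orders)
for error in logical_errors:
    print(error)
```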
Verify data accuracy
Verify the accuracy of the data, which is critical for reliable analysis and decision-making. To complete this task, compare the dataset with reliable sources, perform data reconciliation, or conduct validation checks. The desired result is accurate and trustworthy data.
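Data reconciliation against a reliable source can be sketched as a keyed comparison. The `reconcile` helper, the key field, and both sample datasets are hypothetical; the reference is simply assumed to be the more trustworthy of the two.

```python
def reconcile(dataset, reference, key, fields):
    """Compare dataset rows with reference rows that share the same key.

    Returns (mismatches, unmatched): mismatches holds
    (key_value, field, dataset_value, reference_value) tuples, and
    unmatched holds key values with no counterpart in the reference.
    """
    reference_by_key = {row[key]: row for row in reference}
    mismatches, unmatched = [], []
    for row in dataset:
        ref_row = reference_by_key.get(row[key])
        if ref_row is None:
            unmatched.append(row[key])
            continue
        for field in fields:
            if row[field] != ref_row[field]:
                mismatches.append((row[key], field, row[field], ref_row[field]))
    return mismatches, unmatched


# Hypothetical dataset under validation and a trusted reference extract.
dataset = [
    {"id": "1", "city": "London", "balance": 100},
    {"id": "2", "city": "Paris", "balance": 250},
    {"id": "4", "city": "Rome", "balance": 75},
]
reference = [
    {"id": "1", "city": "London", "balance": 100},
    {"id": "2", "city": "Paris", "balance": 205},
]

mismatches, unmatched = reconcile(dataset, reference, "id", ["city", "balance"])
print("mismatches:", mismatches)
print("unmatched keys:", unmatched)
```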
Approval: Accuracy Verification
- Validate values in the data set (will be submitted)
- Check for logical errors in the data (will be submitted)
- Ensure data is in the correct format (will be submitted)
- Verify data accuracy (will be submitted)
Record any identified issues
Document any issues or problems identified during the data validation process. Recording them helps you track and communicate potential data quality problems and their impact. The desired result is a comprehensive list of identified issues. To complete this task, you can use a longText field to record the issues.
Implement corrective action for identified issues
Implement appropriate corrective actions for the identified data quality issues, either resolving them or mitigating their impact. The desired result is that the necessary steps have been taken to address each identified issue. To complete this task, you may need to consult relevant stakeholders, apply data cleaning techniques, or update data sources.
Re-validate corrected data
Re-validate the corrected data to confirm that the corrective actions have resolved the identified issues. The desired result is corrected data that has been re-validated and is ready for further analysis or processing. To complete this task, repeat the validation steps applied earlier.
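Repeating the earlier validation steps is easiest when the checks are collected in one place and can be run against both the original and the corrected data. The two checks and the sample data below are hypothetical stand-ins for whatever checks your process applied.

```python
def missing_email(rows):
    """Indices of rows whose email is blank."""
    return [i for i, row in enumerate(rows) if not row["email"].strip()]


def negative_amount(rows):
    """Indices of rows whose amount is below zero."""
    return [i for i, row in enumerate(rows) if row["amount"] < 0]


# Hypothetical registry of the checks applied earlier in the process.
CHECKS = {"missing_email": missing_email, "negative_amount": negative_amount}


def run_checks(rows):
    """Run every registered check and return the failing indices per check."""
    return {name: check(rows) for name, check in CHECKS.items()}


original = [
    {"email": "ada@example.com", "amount": 10},
    {"email": "", "amount": 20},
    {"email": "alan@example.com", "amount": -5},
]
corrected = [
    {"email": "ada@example.com", "amount": 10},
    {"email": "grace@example.com", "amount": 20},
    {"email": "alan@example.com", "amount": -5},  # not yet corrected
]

before = run_checks(original)
after = run_checks(corrected)
unresolved = {name: indices for name, indices in after.items() if indices}

print("before:", before)
print("after:", after)
print("unresolved:", unresolved)
```

An empty `unresolved` result would indicate that every check passes on the corrected data; anything left in it goes back to the corrective action task.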
Final data validation report generation
Generate a final data validation report that summarizes the results, findings, and actions taken during the entire data validation process. The report gives a comprehensive overview of data quality and of how effective the validation efforts were. The desired result is a well-documented and accessible report. To complete this task, you may need to use appropriate reporting tools or templates.
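If no reporting tool or template is available, a plain-text summary can be assembled from the counts recorded in earlier tasks. The `build_report` helper, the report layout, and the sample figures are all assumptions for illustration.

```python
def build_report(dataset_name, rows_checked, findings, actions):
    """Assemble a plain-text validation summary from recorded counts and actions."""
    lines = [
        f"Data validation report: {dataset_name}",
        f"Rows checked: {rows_checked}",
        "",
        "Findings:",
    ]
    lines += [f"- {name}: {count}" for name, count in findings.items()]
    lines += ["", "Corrective actions:"]
    lines += [f"- {action}" for action in actions] or ["- none recorded"]
    return "\n".join(lines)


# Hypothetical figures carried over from the earlier tasks.
report = build_report(
    dataset_name="customers.csv",
    rows_checked=3,
    findings={"missing values": 3, "duplicates": 1, "inconsistencies": 2},
    actions=["Normalized country codes", "Requested missing emails from source"],
)
print(report)
```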
Approval: Report Generation
- Final data validation report generation (will be submitted)