Roles & Responsibilities:
- Design and build data quality rules, checks, and validation processes using SQL, Python, or other scripting languages (a minimal illustrative sketch appears after this list).
- Collaborate with data engineers, analysts, and subject matter experts to understand data requirements and establish data quality dimensions.
- Identify data quality issues, determine root causes, and implement solutions to cleanse, standardize, and enrich data.
- Automate data quality checks and integrate them into data pipelines and ETL processes.
- Establish data profiling and monitoring mechanisms to continuously assess data quality and identify anomalies or deviations.
- Test pipelines to ensure data quality and values remain consistent from source systems to target systems.
- Develop and implement data quality standards, processes, and metrics to measure and monitor data quality across the organization.
- Monitor data pipelines for failures or errors and work with data engineers and warehouse engineers to implement fixes in a timely manner.
- Implement data governance policies and procedures to ensure data consistency, traceability, and adherence to regulatory and compliance requirements.
- Develop and maintain documentation, including data quality rules, processes, and issue resolution guides.
- Stay up-to-date with industry best practices, tools, and techniques related to data quality management.
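As an illustration of the kind of rule-based data quality check described in the responsibilities above, the following is a minimal sketch in Python using pandas. The dataset, column names, file path, and thresholds are hypothetical and shown only to convey the style of work; actual checks would be defined against the organization's own data and integrated into its pipelines.

    # Minimal sketch of an automated data quality check (hypothetical data).
    import pandas as pd

    def run_quality_checks(df: pd.DataFrame) -> dict:
        """Return a mapping of rule name -> count of violating rows."""
        return {
            # Completeness: key identifiers must not be null.
            "null_order_id": int(df["order_id"].isna().sum()),
            # Uniqueness: order_id should behave like a primary key.
            "duplicate_order_id": int(df["order_id"].duplicated().sum()),
            # Validity: order amounts must be non-negative.
            "negative_amount": int((df["amount"] < 0).sum()),
        }

    if __name__ == "__main__":
        orders = pd.read_csv("orders_extract.csv")  # hypothetical source extract
        violations = run_quality_checks(orders)
        failed = {rule: n for rule, n in violations.items() if n > 0}
        if failed:
            raise ValueError(f"Data quality checks failed: {failed}")
        print("All data quality checks passed.")

A check like this could be wrapped in a scheduled pipeline task (for example, in Apache Airflow) so that failures surface automatically rather than through manual review.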
Experience & Skills Fitment:
- Bachelor's degree in Computer Science, Information Systems, Statistics, or a related field, or equivalent practical experience.
- 5-7 years’ experience in designing, implementing, monitoring, and maintaining data quality processes and frameworks.
- Strong proficiency in SQL and programming languages like Python or Scala.
- Experience with data quality tools and frameworks (e.g., Talend Data Quality, AWS Glue DataBrew, Azure Data Quality Services).
- Solid understanding of data modeling, database design, and data warehousing principles.
- Knowledge of data integration and ETL processes, as well as experience with tools like Apache Airflow or AWS Data Pipeline.
- Familiarity with data governance and data management best practices.
- Strong analytical and problem-solving skills, with the ability to identify and resolve complex data quality issues.
- Excellent communication and collaboration skills, with the ability to work effectively with cross-functional teams and stakeholders.
Good to have:
- Master's degree in a relevant field or equivalent experience.
- Experience with big data technologies like Apache Spark, Hadoop, or Kafka.
- Knowledge of machine learning techniques for data quality management.
- Familiarity with cloud platforms like AWS, Azure, or GCP.
- Certification in relevant technologies (e.g., Certified Data Quality Professional, AWS Certified Data Analytics - Specialty).
Benefits:
- Kloud9 provides a robust compensation package and a forward-looking opportunity for growth in emerging fields.
Equal Opportunity Employer:
- Kloud9 is an equal opportunity employer and will not discriminate against any employee or applicant on the basis of age, color, disability, gender, national origin, race, religion, sexual orientation, veteran status, or any classification protected by federal, state, or local law.