Managed Airflow Platform (MAP) Support Engineer

Roles & Responsibilities:

As a Senior Engineer with a focus on Managed Airflow Platform (MAP) support engineering, you will:

  • Evangelize and cultivate adoption of Global Platforms, open-source software and agile principles within the organization.
  • Ensure solutions are designed and developed using a scalable, highly resilient cloud native architecture.
  • Ensure the operational stability, performance, and scalability of cloud-native platforms through proactive monitoring and timely issue resolution.
  • Diagnose infrastructure and system issues across cloud environments and Kubernetes clusters, and lead efforts in troubleshooting and remediation.
  • Collaborate with engineering and infrastructure teams to manage configurations, resource tuning, and platform upgrades without disrupting business operations.
  • Maintain clear, accurate runbooks, support documentation, and platform knowledge bases to enable faster onboarding and incident response.
  • Support observability initiatives by improving logging, metrics, dashboards, and alerting frameworks.
  • Advocate for operational excellence and drive continuous improvement in system reliability, cost-efficiency, and maintainability.
  • Work with product management to support product/service scoping activities.
  • Work with leadership to define delivery schedules of key features through an agile framework.
  • Be a key contributor to overall architecture, framework and design of global platforms.

Experience & Skills Fitment:

  • Bachelor’s or Master’s degree in Computer Science or a related field.
  • 3+ years of experience in large-scale production-grade platform support, including participation in on-call rotations.
  • 3+ years of hands-on experience with cloud platforms like AWS, Azure, or GCP.
  • 2+ years of experience developing and supporting data pipelines using Apache Airflow, including DAG lifecycle management and scheduling best practices.
  • Experience troubleshooting task failures, scheduler issues, and performance bottlenecks, and managing error handling.
  • Strong programming proficiency in Python, especially for developing and troubleshooting RESTful APIs.
  • 1+ years of experience in observability using the ELK stack (Elasticsearch, Logstash, Kibana) or the Grafana stack.
  • 2+ years of experience with DevOps and Infrastructure-as-Code tools such as GitHub, Jenkins, Docker, and Terraform.
  • 2+ years of hands-on experience with Kubernetes, including managing and debugging cluster resources and workloads within Amazon EKS.
  • Exposure to Agile and test-driven development is a plus.
  • Experience delivering projects in a highly collaborative, multi-disciplined development team environment.

Good to Have:

  • Exposure to Agile, ideally a strong background with the SAFe methodology.
  • Working knowledge of Node.js is an added advantage.
  • Experience with any additional monitoring or observability tool is a plus.

Benefits:

  • Kloud9 provides a robust compensation package and a forward-looking opportunity for growth in emerging fields.

Equal Opportunity Employer:

  • Kloud9 is an equal opportunity employer and will not discriminate against any employee or applicant on the basis of age, color, disability, gender, national origin, race, religion, sexual orientation, veteran status, or any classification protected by federal, state, or local law.

Send resumes to: recruitment@kloud9.nyc
