Key responsibilities include:
• Design, deploy, and maintain scalable, secure, and highly available infrastructure on cloud platforms such as Google Cloud Platform (GCP) and AWS.
• Implement and manage monitoring, logging, and alerting systems to proactively identify and address issues in the cloud environment (a brief instrumentation sketch follows this list).
• Develop and automate deployment processes for efficient infrastructure provisioning and configuration management of cloud resources.
• Work closely with platform engineers to integrate cloud infrastructure with CI/CD pipelines and deployment workflows.
• Collaborate with data engineers to optimize data pipelines for performance, reliability, and cost-effectiveness.
• Conduct regular performance tuning and capacity planning to ensure optimal utilization of cloud resources.
• Participate in incident response and troubleshooting for production issues in data pipelines and API services.
• Ensure compliance with industry standards and best practices for data security and regulatory requirements.
• Stay updated on emerging cloud technologies and best practices and evaluate their potential impact and application in our systems and processes.
• Provide technical leadership and mentorship to junior team members, and foster a culture of knowledge sharing and continuous learning.
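For illustration, a minimal sketch of the kind of service instrumentation the monitoring and alerting responsibility implies, using the Python prometheus_client library; the metric names, port, and simulated workload are assumptions of the sketch, not part of an actual system:

    # Minimal Prometheus instrumentation sketch; metric names are hypothetical.
    import random
    import time

    from prometheus_client import Counter, Histogram, start_http_server

    REQUESTS = Counter("api_requests_total", "Total API requests", ["status"])
    LATENCY = Histogram("api_request_latency_seconds", "API request latency")

    def handle_request() -> None:
        with LATENCY.time():                        # record request duration
            time.sleep(random.uniform(0.01, 0.1))   # stand-in for real work
            status = "200" if random.random() > 0.05 else "500"
        REQUESTS.labels(status=status).inc()

    if __name__ == "__main__":
        start_http_server(9100)  # expose /metrics for Prometheus to scrape
        while True:
            handle_request()

An alerting rule in Prometheus or Grafana would then fire on, for example, the ratio of "500" responses to total requests.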
Minimum Requirements:
• Bachelor's degree in Computer Science, Engineering, or a related field, plus 7+ years of relevant job experience.
• Significant experience in cloud infrastructure management, preferably in a data-intensive environment.
• Strong proficiency in cloud platforms such as GCP or AWS, including services like BigQuery, Aurora, GCS/S3, GCE/EC2, Cloud Functions/Lambda, Pub/Sub/SQS/SNS, GKE/EKS, Dataflow, Cloud Spanner, etc.
• Hands-on experience with Infrastructure as Code (IaC) tools such as Terraform.
• Proficiency in programming or scripting languages such as GoLang, Python, or Bash for automation and infrastructure management (a brief automation sketch follows this list).
• Experience with containerization technologies and orchestration platforms such as Docker and Kubernetes.
• Experience with monitoring and logging tools such as Grafana, Prometheus, ELK stack, or equivalent.
• Familiarity with data engineering concepts and tools such as SQL, Kafka, or similar technologies.
• Solid understanding of networking concepts, security principles, and best practices for cloud environments.
• Excellent problem-solving skills and the ability to work effectively in a fast-paced, collaborative environment.
• Strong communication skills and the ability to articulate technical concepts to non-technical stakeholders.
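As a brief illustration of scripting for infrastructure management, a minimal Python sketch that audits Google Cloud Storage buckets for a required label, using the google-cloud-storage client library; the "owner" label key is a hypothetical policy, and Application Default Credentials are assumed:

    # Minimal automation sketch: flag GCS buckets missing an "owner" label.
    # Assumes Application Default Credentials; the label key is hypothetical.
    from google.cloud import storage

    def audit_buckets() -> None:
        client = storage.Client()
        for bucket in client.list_buckets():
            labels = bucket.labels or {}
            if "owner" not in labels:
                print(f"{bucket.name}: missing 'owner' label")

    if __name__ == "__main__":
        audit_buckets()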
Desirable qualifications:
• Highly proficient in GoLang or Python.
• Demonstrated experience in cloud cost monitoring and optimization, using tools such as Google Cloud Billing or AWS Cost Explorer to identify cost inefficiencies and implement cost-saving measures.
• Demonstrated experience defining and implementing Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs) to ensure system reliability and performance (a brief SLI/SLO sketch follows this list).
• Experience with multi-cloud environments.
• Experience with geospatial data processing and analysis tools, or with geospatial datasets.
• Experience with cloud-based machine learning services and platforms such as Google Cloud Vertex AI or AWS SageMaker, including model training, evaluation, and deployment workflows.
• Basic understanding of data preprocessing techniques such as feature scaling, feature engineering, and dimensionality reduction, and their application in preparing environmental data for machine learning models (a brief preprocessing sketch follows this list).
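For illustration of the SLI/SLO item above, a minimal sketch of the underlying arithmetic: an availability SLI and the fraction of error budget remaining against a 99.9% objective; the request counts and the objective are illustrative assumptions:

    # Minimal SLI/SLO sketch: availability SLI and remaining error budget.
    def availability_sli(good_requests: int, total_requests: int) -> float:
        return good_requests / total_requests

    def error_budget_remaining(sli: float, slo: float) -> float:
        allowed = 1.0 - slo   # fraction of requests allowed to fail
        burned = 1.0 - sli    # fraction that actually failed
        return 1.0 - burned / allowed

    sli = availability_sli(good_requests=999_200, total_requests=1_000_000)
    print(f"SLI: {sli:.4%}")                                         # 99.9200%
    print(f"Budget left: {error_budget_remaining(sli, 0.999):.1%}")  # 20.0%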
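And for the preprocessing item, a minimal scikit-learn sketch chaining feature scaling with PCA-based dimensionality reduction; the synthetic data standing in for environmental readings is an assumption of the sketch:

    # Minimal preprocessing sketch: feature scaling + dimensionality reduction.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Synthetic stand-in for environmental readings: 100 samples, 8 features.
    X = np.random.default_rng(0).normal(size=(100, 8))
    pipeline = make_pipeline(StandardScaler(), PCA(n_components=3))
    X_reduced = pipeline.fit_transform(X)
    print(X_reduced.shape)  # (100, 3)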