Talent500

Site Reliability Engineer/Cloud Engineering Manager - Configuration Management

Job Location

Chennai, India

Job Description

Role: Senior Engineering Manager, Cloud Engineer, Site Reliability Engineering, for Ford Credit Tech

We are passionate about building software that solves problems. We count on our Site Reliability Engineers (SREs) to empower our users with a rich feature set, high availability, and stellar performance so they can pursue their missions. We are currently seeking an engineer with public cloud experience to plan, design, and implement next-generation cloud infrastructure solutions. The Cloud Engineer will be part of the Engineering team and will need strong knowledge of application monitoring, infrastructure monitoring, automation, maintenance, and service reliability improvements. Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences at every interaction.

Role & Responsibilities:

- Design, automate, and manage a highly available and scalable cloud deployment that allows development teams to deploy and run their services.
- Automate operational tasks and processes using tools like Cloud Shell, Cloud SDK, or Cloud APIs. This includes creating scripts, templates, or workflows to automate deployment, scaling, monitoring, and maintenance activities.
- Extensively automate deployments and manage applications in GCP.
- Collaborate with engineering and architecture teams to evaluate and identify optimal cloud solutions, taking scalability, high performance, and security into account.
- Design the architecture of the self-hosted cloud platform, considering scalability, performance, and security requirements.
- Deploy, configure, and manage the various components of the self-hosted cloud platform, including virtualization, networking, storage, and orchestration tools.
- Modernize existing on-prem solutions and improve existing systems.
- Develop and maintain cloud solutions in accordance with best practices.
- Ensure the efficient functioning of data storage and processing functions in accordance with company security policies and cloud security best practices.
- Collaborate with Engineering teams to identify optimization strategies and help develop self-healing capabilities.
- Develop strong observability capabilities.
- Identify, analyze, and resolve infrastructure vulnerabilities and application deployment issues.
- Regularly review existing systems and recommend improvements.
- Should have capability to.

Required Skills and Selection Criteria:

- Proven work experience in designing, deploying, and operating mid- to large-scale public cloud environments.
- Proven experience in designing, deploying, and managing self-hosted cloud platforms.
- Strong scripting and automation skills (Golang, Python, Bash, Terraform).
- Proven work experience in Docker/Kubernetes (image building, k8s scheduling).
- Experience in package, config, and deployment management via Helm, Kustomize, and ArgoCD.
- Proven work experience in provisioning Infrastructure as Code (IaC) using Terraform Enterprise or community edition.
- Proven work experience in writing custom Terraform providers/plug-ins with Sentinel Policy as Code.
- Strong knowledge of GitHub and DevOps (Cloud Build is an advantage).
- Proficiency in scripting and coding, including languages such as Python, Golang, Java, JavaScript, and Node.js.
- Familiarity with networking concepts and protocols (TCP/IP, DNS, DHCP).
- Experience in security best practices and techniques for securing self-hosted environments.
- Solid understanding of cloud infrastructure components such as compute, storage, and networking.
- Extensive knowledge and hands-on experience in Grafana and Prometheus micro libraries.
- Exposure to Cloud Monitoring and logging.
- Experience with distributed storage technologies such as NFS, HDFS, Ceph, and S3, as well as dynamic resource management frameworks (Mesos, Kubernetes, YARN).
- Experience with automation tools is a priority.
- Professional certification is an advantage.
- Public cloud experience with GCP is good to have.

Preferred Qualifications:

- Must have 10 years of experience in infrastructure.
- Must have 5 years of experience in Cloud Computing (public cloud).
- Must have 5 years of experience in DevSecOps.
- Must have 3 years of experience in virtualization and containers, IaC, and cloud hosting.

(ref:hirist.tech)

Location: Chennai, IN

Posted Date: 4/24/2024

Contact Information

Contact	Human Resources Talent500