Title: Manager-SRE
Area(s) of responsibility
Responsibilities
- Lead the SRE team in designing, building, and maintaining the company's large-scale, complex systems.
- Collaborate with the development team to ensure system reliability, efficiency, and performance.
- Develop and implement automation strategies to improve the scalability and reliability of systems.
- Monitor system performance, troubleshoot issues, and conduct root cause analysis to prevent recurrence.
Required Skills
- Proficiency in programming languages such as Python, Java, or Go.
- Strong understanding of cloud computing platforms like AWS, Google Cloud, or Azure.
- Expertise in system design, system management, and storage systems.
- Bachelor's degree in Computer Science, Information Technology, or a related field; a Master's degree or relevant certifications are advantageous.
Preferred Skills
- Familiarity with containerization and orchestration technologies like Docker and Kubernetes.
- Experience with CI/CD pipelines and tools such as Jenkins, GitLab CI/CD, or CircleCI.
- Knowledge of monitoring tools like Prometheus, Grafana, or New Relic.
- Understanding of database technologies, both relational (SQL) and NoSQL (such as MongoDB).