Site Reliability Engineer/System Administrator at ENGIE

Kenya
Permanent
Full-time

2 months ago

ENGIE is a leading global group and a worldwide reference in low-carbon energy and services.

Site Reliability Engineer/System Administrator

Job Purpose/Mission

We are seeking a talented and experienced System Administrator/Site Reliability Engineer (SRE) to join our dynamic team. As an SRE, you will play a crucial role in ensuring the reliability, scalability, and performance of our systems and services. You will collaborate with cross-functional teams to implement and maintain robust infrastructure solutions, focusing on automation, monitoring, and incident response. The ideal candidate is passionate about optimizing and enhancing system reliability, possesses strong problem-solving skills, and is committed to driving excellence in operational practices.

Responsibilities:

Infrastructure Automation:
Develop and maintain automation tools and scripts for provisioning, configuration, and deployment.
Implement infrastructure as code (IaC) practices to ensure consistency and reproducibility.
Monitoring and Incident Response:
Set up and maintain monitoring systems to detect and respond to performance issues and outages.
Participate in on-call rotations, respond promptly to incidents, troubleshoot them, and implement solutions to prevent recurrence.
Performance Optimization:
Optimize system performance through continuous analysis and tuning.
Reliability Engineering:
Implement best practices for reliability, such as error budgets, SLIs/SLOs, and blameless post-mortems.
Work towards minimizing manual intervention through automation.
System Administration:
Manage and maintain server infrastructure, including installation, configuration, and troubleshooting of operating systems.
Implement and maintain security measures, such as firewalls and intrusion detection systems.
Perform regular system backups and recovery procedures.
Collaboration and Communication:
Collaborate with cross-functional teams to align infrastructure and operational requirements.
Provide technical guidance and support to colleagues in areas related to reliability.

Qualifications:

Bachelor’s degree in Computer Science, Information Technology, or a related field.
Proven experience as a Site Reliability Engineer or System Administrator.
Strong Linux and Bash scripting skills.
Proficiency in cloud platforms (e.g., AWS, Azure, GCP, Linode, DigitalOcean).
Experience with container and orchestration tools (e.g., Docker, LXD, Kubernetes).
In-depth knowledge of networking, security, and system administration.
Familiarity with infrastructure as code tools (e.g., Terraform, Ansible).
Excellent problem-solving and troubleshooting skills.
Strong communication and collaboration skills.

Preferred Qualifications:

Experience with CI/CD pipelines and related tools.
Knowledge of distributed systems and microservices architecture.
Familiarity with observability tools (e.g., Prometheus, Grafana, ELK stack).
Familiarity with programming languages (e.g., Python, Ruby).
