Operations/System Reliability Engineer
REPORTING TO: Head of Product Delivery
LOCATION: Belfast/Derry/Hybrid
THE ROLE:
This role will involve building, testing and maintaining appropriate infrastructure and tools to provide our customers with an effective, reliable service in an efficient environment.
RESPONSIBILITIES:
- Work with the Head of Product Delivery to define Operational requirements and framework
- Identify priorities based on requirements within overarching framework
- Plan and execute priorities
- Initiate and manage risk assessments
- Identify ‘normal’ operating procedures and optimise them
- Design, build or source systems, tools and processes to meet Operational needs
- Outline and define Operational requirements for Delivery and Support
- Work with agile development processes and take ownership of aspects of Operational requirements by putting appropriate methods and tools in place
- Implement monitoring, log analysis and reporting systems associated with hardware, sites, software, performance, cost, security and user experience
- Troubleshoot configuration, environmental and software issues and help identify solutions
- Automate processes
- Optimise costs and productivity, with a focus on customer experience and performance relative to cost
ESSENTIAL CRITERIA:
- Degree-level education in a relevant discipline or equivalent experience
- 12 months' experience in an Operational role or a developer role involving significant Operational considerations
- Experience in at least one of the main cloud technologies – AWS, Azure, Red Hat, GCP, IBM Cloud
- Strong working knowledge of Linux
- Strong competence in Python
- Experience of building and implementing automated pipelines including working with repos, build automation tools, build orchestration and environment automation
- Experience in implementing tools for logging, monitoring and alerting
- Experience in creating and automating virtual machines in public and private clouds
- An understanding of, or experience with, high availability, business continuity and disaster recovery solutions in the cloud
- Strong communication skills
DESIRABLE CRITERIA:
- A recognised DevOps or SysOps certification, e.g. from AWS
- Experience developing custom scripts in Python, Bash, PowerShell, Go or a similar language
- Experience implementing cloud infrastructure and networking required to host services, including storage, firewall and network configuration
- Experience in deploying serverless functions, e.g. AWS Lambda
- Experience of Agile methods such as Scrum, Lean or Kanban, using Jira or similar agile tracking tools