Site Reliability Engineer

20 May 2020

Our client, a leading AI/Machine Learning company, is looking for a strong technical SRE.

* Collaborate with the team to deliver working software incrementally that provides an
elegant and seamless experience for our customers
* Establish and track Service-Level Objectives (SLOs) and Service-Level Indicators (SLIs)
for production and development environments
* Build and improve tools for building, deploying, monitoring and managing our systems
* Stay committed to simultaneously improving security, reliability, scalability, and the
team's ability to introduce new features and performance improvements
* Proactively induce failures in production systems to test automated fail-over and
service continuity capabilities
* Plan, automate and test disaster recovery
* Diagnose and troubleshoot problems; implement changes so problems only happen once
* Provide technical leadership and coaching to influence the design, development and
implementation of secure and reliable systems that delight customers
* Stay current with the latest design, web, mobile, cloud computing, machine learning
and data science technologies to best support the engineering group
Requirements
* B.S. degree in Software Engineering, Computer Science or related technical field (e.g.
EE, physics or mathematics) or equivalent practical experience
* AWS Certifications, or equivalent practical experience
* 3+ years of experience as a systems administrator, DevOps engineer, SRE, or similar
technical role
* Experience deploying and maintaining cloud applications using AWS
* Experience using Linux, Unix and macOS
* Experience using PostgreSQL or another relational database
* Broad knowledge of web, mobile and server software development, data stores,
networking, security, machine learning, and cloud computing services
* Quick problem-solving skills and the ability to exercise independent judgement
* Strong people, process and technical leadership abilities
* Proven track record of delivering quality work on time
* Excels when co-located with a small team using an agile workflow
* A positive, professional attitude and customer service orientation
* Great communication, collaboration and presentation skills
* Continuously learning new programming languages, frameworks, technologies and
approaches to generate innovative solutions
* A proven technical mentor who is keen to coach others and share ideas
* A passion for engineering excellence
Valuable Experience
* Supported large-scale web, mobile and server applications in production
* Depth in modern data engineering and analytics, including platforms such as Hadoop,
Kafka or Elasticsearch
* Understanding of UX design practices
* Web application development using React, Angular or Vue
* Mobile application development for iOS or Android
* API development using GraphQL or REST

Reshmi Nair
Technology Recruiter | Security and DevOps, Toronto