- 5+ years of relevant experience in the following areas: SRE, DevOps, Cloud Operations, Systems Engineering, or Software Engineering;
- Key skills: Linux, Networking, Docker, Kubernetes, Git, Java;
- Strong troubleshooting, analytical, and problem-solving skills;
- Knowledge of AWS or another public cloud platform;
- Experience with monitoring, logging, and telemetry tools such as Datadog, New Relic, Splunk, ELK, Nagios, Prometheus, or AWS CloudWatch;
- Flexibility to adapt to a dynamic environment and make quick, sound decisions under pressure;
- Good understanding of CI/CD processes and solutions.
- Engage in and improve the full lifecycle of services, from inception and design, through to deployment, operation, and refinement;
- Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity;
- Support and troubleshoot cloud deployments and environment issues;
- Lead in the adoption of continuous delivery and automation of platform services;
- Identify and implement infrastructure resilience improvements;
- Implement monitoring tools and dashboards for various services;
- Evaluate and propose tools and techniques to improve operational activities.
- You are ready to work until noon in the Toronto time zone.