Site Reliability Engineer

  • Cloud & DevOps vacancies
  • Remote
  • Hot vacancy

Requirements:

  • 5+ years of relevant experience in the following areas: SRE, DevOps, Cloud Operations, Systems Engineering, or Software Engineering;
  • Key skills: Linux, Networking, Docker, Kubernetes, Git, Java;
  • Strong troubleshooting, analytical, and problem-solving skills;
  • Knowledge of AWS or another public cloud platform;
  • Experience with monitoring, logging & telemetry tools like Datadog, New Relic, Splunk, ELK, Nagios, Prometheus, AWS CloudWatch, etc.;
  • Ability to adapt to a dynamic environment and make quick, sound decisions under pressure;
  • Good understanding of CI/CD processes and solutions.

Responsibilities:

  1. Engage in and improve the full lifecycle of services, from inception and design, through to deployment, operation, and refinement;
  2. Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity;
  3. Support and troubleshoot Cloud deployments and environment issues;
  4. Lead the adoption of continuous delivery and automation of platform services;
  5. Identify and implement infrastructure resilience improvements;
  6. Implement monitoring tools and dashboards for various services;
  7. Evaluate and propose tools and techniques to improve operational activities.

Working conditions:

  • Full-time;
  • Remote;
  • Readiness to work until noon, Toronto time.
Анна Грошева
Human resources