Localize is looking for a Platform Reliability Engineer to join our growing engineering team. As Localize expands, the scalability, reliability, and performance of our infrastructure and applications have become paramount. This role is dedicated to overseeing and managing all aspects of Localize’s technical infrastructure, databases, and system tools, and to implementing strategies for effective monitoring, alerting, and maintenance. You will be responsible for the scalability, stability, reliability, and performance of the Localize platform. This role will also support DevOps and improve the systems used by the engineering team to enhance productivity.
Key Responsibilities:
- Oversee and manage Localize’s infrastructure across AWS and Cloudflare.
- Ensure the scalability, reliability, performance, and security of Localize’s data stores, specifically Redis and MongoDB, through effective configuration, monitoring, query optimization, and backup management.
- Oversee and automate the deployment process.
- Own and improve monitoring of uptime and performance using tools such as Bugsnag, Datadog, and New Relic.
- Identify, plan, and implement improvements and optimizations for infrastructure relating to cost, reliability, and scalability.
- Develop and maintain detailed documentation and maps of our infrastructure and dependencies.
- Actively learn new technologies and techniques to improve the functionality and capabilities of our infrastructure.
Must-Have Skills:
- 5+ years of engineering experience, including at least 2 years in an SRE and/or DevOps role.
- Experience in managing and optimizing infrastructure in AWS.
- Redis and MongoDB, including configuration, monitoring, optimization, and backup management.
- Experience with manual or automated deployment and release management.
- Knowledge of best practices for securing infrastructure, including managing access controls and identifying potential vulnerabilities.
- Skills in assessing and optimizing infrastructure and applications for improved performance, reliability, scalability, and cost-efficiency.
- Proficiency in command-line scripting.
Nice-to-Have Skills or Experience:
- Infrastructure as Code (IaC) tools like Terraform or CloudFormation.
- Understanding of continuous integration and continuous deployment (CI/CD) processes and tools.
- Elasticsearch configuration for advanced search capabilities.
- Disaster recovery planning, including data integrity and availability in case of emergencies.
- Experience with APM and logging tools such as Datadog, New Relic, or Kibana.