Senior Site Reliability Engineer

Prepared

Software Engineering
New York, NY, USA
Posted on Monday, March 18, 2024
Emergency centers, vital to our society, are often constrained by technology that dates back to the landline era. This stands in stark contrast to today's smartphone-centric, socially connected world. At Prepared 911, we bridge this gap with cutting-edge technology that uses artificial intelligence to transform how emergency calls are handled. Our suite of tools significantly boosts the capabilities of 911 dispatch centers and first responders. With our solutions implemented in over 800 cities across 48 states, we're positively impacting the lives of approximately 75 million people.

Fueled by a recent $16 million Series A funding round from Andreessen Horowitz and support from Gradient (Google’s AI Fund), we're accelerating our growth to become an integral part of every emergency call, reshaping global emergency response for a safer, more responsive world.

Joining the Prepared 911 team means more than just a new role. It's a chance to be at the forefront of impact tech that significantly improves public safety and touches lives across the globe. Your work here will have a direct, meaningful impact, contributing to the technological evolution of emergency services. At Prepared 911, you're not just part of a team; you're a key player in a larger mission to foster a safer, more interconnected world.

Senior Site Reliability Engineer:

Our ideal Site Reliability Engineer is skilled in designing and implementing resilient, user-centric systems, focusing on simplicity and reliability. You bring a strong work ethic with a bias towards proactive solutions and tangible outcomes. As a valuable member of our engineering team, you contribute positively to our culture. You have technical expertise in areas like cloud platforms, scripting (Python, Bash), and orchestration layers (Kubernetes). You communicate well, collaborate effectively across teams, and think strategically, emphasizing system sustainability and optimization. You are adept at context-switching, maintaining system performance, and driving continuous improvement in our infrastructure.

What you’ll do:

  • Design and Implement Systems: Architect and develop production-grade systems optimizing for resiliency and simplicity, with a focus on user accessibility and system reliability.
  • Automation and Tooling: Establish standards for deterministic automation. Develop tools to streamline operations, reduce manual intervention, and manage infrastructure using code.
  • System Optimization and Reliability: Monitor and optimize system performance and capacity. Lead incident management, including response, analysis, and long-term solution implementation.
  • Collaboration and Leadership: Collaborate with cross-functional teams on engineering efforts from requirements to production. Mentor junior engineers and communicate complex technical concepts clearly to different audiences.
  • Continuous Improvement: Continuously improve the on-call experience and system sustainability. Standardize and implement monitoring, logging, alerting, and SLO reporting.
  • Project Management: Independently drive projects to completion, coordinating with key stakeholders. Ensure compliance with all applicable laws and regulations.
  • Strategic Influence: Influence the technical direction of the company, driving service architecture and prioritizing technical and product roadmaps.

About you:

  • Experience: 5+ years of relevant work experience in DevOps, Site Reliability Engineering, or Platform Engineering, including experience with cloud platforms (AWS, GCP, Azure) and orchestration layers (Kubernetes).
  • Technical Expertise: Proficiency in scripting languages (e.g., Python, Bash), Linux administration, and modern programming languages (e.g., Ruby, Go, Python).
  • Problem-Solving and Systems Thinking: Strong work ethic, resilience, and a systematic approach to complex systems and problem-solving.
  • Communication and Leadership: Effective communication skills, ability to simplify complex topics, and proven leadership in technical domains.
  • Education: Bachelor’s degree in Computer Science or equivalent experience. Advanced degrees or certifications are advantageous.
  • Automation and Orchestration: Experience with monitoring (e.g., Datadog), Infrastructure as Code (e.g., Terraform, Ansible), and CI/CD technologies.

Bonus points for:

  • Strong mentorship skills and a desire to build fault-tolerant, scalable software systems.
  • Experience in large-scale system design and operation.
  • A data-driven analytical approach to solving complex challenges.

Pay Transparency:

The base pay for this role is $140,000–$180,000 per year. You are also eligible for employee benefits, company equity grants, participation in Prepared’s unlimited vacation program, and a free One Medical membership.