Site Reliability Engineer
Spectraforce
Seattle, Washington

an hour ago

Job Description

Job Title: Sr. Systems Reliability Engineer
Location: Seattle, WA
Duration: 12 months, contract-to-hire (CTH)

 
Key Responsibilities:
  • Contribute to the SRE strategy and establish best practices for release management, automation, and system reliability.
  • Mentor and guide SRE, Engineering, and Product teams in adopting core SRE principles such as service ownership, reducing toil, and continuous improvement.
  • Lead initiatives across SLIs/SLOs, observability, incident management, and postmortem practices, ensuring insights and learnings are captured and acted upon.
  • Champion SRE practices by implementing repeatable templates for logging, monitoring, and alerting frameworks.
  • Drive observability and monitoring excellence using tools such as Grafana, AppDynamics (AppD), and Sumo Logic, ensuring proactive detection and resolution of issues.
  • Partner with engineering to design reliable, fault-tolerant systems and reduce operational toil through automation.
  • Implement and leverage the Ansible Automation Platform to help teams automate infrastructure provisioning, configuration management, and event-driven workflows.
  • Enable teams to automate operational events and infrastructure changes, reducing manual intervention and improving system resilience.
  • Exercise sound judgment to ensure operational compliance with security, privacy, audit, disaster recovery, and other company requirements.
 
Job-Specific Skills, Experience & Education
  • Minimum of 5 years of experience in Site Reliability Engineering, IT operations, or related fields.
  • Bachelor’s degree in computer science or engineering, or equivalent experience (2 additional years of experience in lieu of a degree).
  • Technical expertise in system reliability, scalability, application design, and performance.
  • Hands-on experience with observability and monitoring tools such as Grafana, AppDynamics, and Sumo Logic.
  • Experience with automation platforms, particularly Ansible, for infrastructure and event-driven automation.
  • Proven ability to mentor and guide engineers in adopting SRE practices and principles.
  • Excellent communication and collaboration skills across diverse teams and vendors.
  • Strong judgment and problem-solving capabilities.
  • Experience working in multi-cloud environments.
  • Strong interpersonal, organizational, communication, and customer service skills. 
Preferred
  • Experience applying ITIL, SRE, and IT process best practices.
  • Experience in tracking major incidents, rollbacks, and hotfixes; leading root cause analysis (RCA) processes; and ensuring resolution and completion of action items.
  • Experience with technical engineering in IT operations.
 
 
At SPECTRAFORCE, we are committed to maintaining a workplace that ensures fair compensation and wage transparency in compliance with all applicable state and local laws. The starting pay for this position is $55.00/hr.
