Site Reliability Engineer

Spectraforce

Seattle, Washington

an hour ago

Job Description

Job Title: Sr. Systems Reliability Engineer
Location: Seattle, WA
Duration: 12 months, contract-to-hire (CTH)

Key Responsibilities:

Contribute to the SRE strategy and establish best practices for release management, automation, and system reliability.
Mentor and guide SRE, Engineering, and Product teams in adopting core SRE principles such as service ownership, reducing toil, and continuous improvement.
Lead initiatives across SLIs/SLOs, observability, incident management, and postmortem practices, ensuring insights and learnings are captured and acted upon.
Champion SRE practices by implementing repeatable templates for logging, monitoring, and alerting frameworks.
Drive observability and monitoring excellence using tools such as Grafana, AppDynamics (AppD), and Sumo Logic, ensuring proactive detection and resolution of issues.
Partner with engineering to design reliable, fault-tolerant systems and reduce operational toil through automation.
Implement and leverage the Ansible Automation Platform to help teams automate infrastructure provisioning, configuration management, and event-driven workflows.
Enable teams to automate operational events and infrastructure changes, reducing manual intervention and improving system resilience.
Exercise sound judgment to ensure operational compliance with security, privacy, audit, disaster recovery, and other company requirements.

Job-Specific Skills, Experience & Education

Minimum of 5 years of experience in Site Reliability Engineering, IT operations, or related fields.
Bachelor’s degree in computer science or engineering, or equivalent experience (2 additional years of experience in lieu of a degree).
Technical expertise in system reliability, scalability, application design, and performance.
Hands-on experience with observability and monitoring tools such as Grafana, AppDynamics, and Sumo Logic.
Experience with automation platforms, particularly Ansible, for infrastructure and event-driven automation.
Proven ability to mentor and guide engineers in adopting SRE practices and principles.
Excellent communication and collaboration skills across diverse teams and vendors.
Strong judgment and problem-solving capabilities.
Experience working in multi-cloud environments.
Strong interpersonal, organizational, and customer service skills.

Preferred

Experience applying ITIL, SRE, and IT process best practices.
Experience in tracking major incidents, rollbacks, and hotfixes; leading root cause analysis (RCA) processes; and ensuring resolution and completion of action items.
Experience with technical engineering in IT operations.

Applicant Notices & Disclaimers

At SPECTRAFORCE, we are committed to maintaining a workplace that ensures fair compensation and wage transparency in compliance with all applicable state and local laws. The starting pay for this position is $55.00/hr.
