Site Reliability Engineer Job at SDI International Corp., Chicago, IL

  • SDI International Corp.
  • Chicago, IL

Job Description

Site Reliability Engineer

Description and Requirements

About Our Team

We are building Quantum, a next‑generation hybrid AI platform that spans Windows, Android, and cloud. As part of this initiative, we are growing the reliability engineering organization that powers cross‑device Personal AI.

We are hiring Site Reliability Engineers (SREs) to strengthen the reliability, observability, and operational excellence of Qira’s AI systems across device, edge, and cloud. Depending on your strengths, you may be aligned to areas such as Observability, Operations, or Service Reliability.

We work with the speed and creativity of a startup: you’ll help build foundational systems with clarity, ownership, and modern engineering practices.

Location: Chicago, IL. Hybrid (3 days on-site, 2 days remote).

What You Might Work On

As an SRE, you may be responsible for a subset of the following, depending on team placement and skill alignment:

Reliability & Systems Engineering

  • Support the reliability, availability, and performance of distributed systems across cloud, edge, and device environments.
  • Help define, measure, and monitor SLIs and SLOs for core services.
  • Identify reliability risks and collaborate with senior engineers on mitigation plans.

Operational Excellence

  • Participate in on‑call rotations and assist with incident response and post‑incident reviews.
  • Contribute improvements to runbooks, automation, and tooling that reduce alert noise and operational toil.
  • Help enhance detection, alerting, and response workflows.

Observability & Insight

  • Implement and improve telemetry using OpenTelemetry, Grafana, and related tools.
  • Build dashboards and tools that improve visibility into system health and AI service behavior.
  • Ensure observability data is complete, accurate, and actionable.

Deployments & Change Safety

  • Support safe, reliable deployment workflows including canaries, staged rollouts, and automated rollbacks.
  • Assist in improving CI/CD systems and deployment tooling.

Collaboration & Best Practices

  • Work closely with senior SREs, DevOps engineers, AI/ML teams, and platform engineers.
  • Contribute to reliability reviews, operational readiness checks, and cross‑team projects.
  • Advocate for modern SRE and DevOps practices within the organization.

Basic Qualifications

  • 4+ years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or production systems operations.
  • Bachelor’s Degree in Computer Science, Engineering, or related technical field (or equivalent practical experience).
  • Foundational experience supporting distributed systems in production.
  • Ability to write scripts or tools in Python, Go, Bash, or similar languages.
  • Solid understanding of Linux systems, networking basics, and system performance fundamentals.
  • Experience with cloud platforms (Azure preferred, AWS or GCP acceptable).
  • Familiarity with monitoring/observability (metrics, logs, tracing).
  • Experience with containers and Kubernetes.

Preferred Qualifications

  • Experience with OpenTelemetry instrumentation and telemetry pipelines.
  • Hands‑on experience with Grafana, Prometheus, Loki, or Tempo.
  • Exposure to AI/ML systems, inference services, or data‑intensive workloads.
  • Experience contributing to CI/CD processes and deployment automation.
  • Familiarity with hybrid architectures spanning device, edge, and cloud.
  • Passion for automation, reliability, and operational excellence.

What Success Looks Like

  • Systems become easier to operate, observe, and trust.
  • Alerts are more accurate and actionable.
  • On‑call load decreases through thoughtful automation and improvements.
  • Deployment workflows become more reliable and repeatable.
  • You grow toward deeper ownership and technical leadership within the reliability engineering organization.
