Core Team · SREGym · Paper Accepted at CAIS '26

Lily Gniedziejko

Research Engineer · CS @ UIUC · AI + Systems + SRE

I build the infrastructure that teaches AI to resolve real-world production failures. Core team @ SREGym, a high-fidelity, interactive benchmark for AI Site Reliability Engineering, with a paper accepted at CAIS '26 and active adoption at Microsoft Research, Resolve AI, and the University of Washington. Supported by a Slingshot grant from the Laude Institute.


In the News

SREGym and my research have been recognized by academia and industry.

Press Siebel School · UIUC

SREGym Featured in Siebel School of Computing and Data Science News

The Siebel School highlighted SREGym for its novel approach to evaluating AI SRE agents against real production failures in live cloud environments.

Industry Adoption

Used by Microsoft Research, Resolve AI, Univ. of Washington & SRE Startups

Adopted by top researchers and commercial agentic SRE platforms. Resolve AI's production observability controller drops into a SREGym cluster with a single kubectl command, validating the framework as a drop-in evaluation harness for industry-grade AI SRE tooling.

xLab · UIUC · Core Team · Jun 2025 – Present · Supervised by Prof. Tianyin Xu

SREGym

A high-fidelity, interactive benchmark for AI Site Reliability Engineering: 90 SRE problems spanning hardware, OS, misoperation, and application-level faults across Kubernetes, TiDB, MongoDB, and Kafka. It models the complexity of production through ambient noise injection and diverse failure modes (metastable, correlated). Agents diagnose and mitigate via Prometheus, Loki, and Jaeger MCP servers.

TiDB

Misoperation Fault Mechanism

Enabled misoperation as a new fault class in SREGym by porting TiDB and building a new application on top, expanding the benchmark's fault coverage into a previously unexplored category.

TiDB · Fault Injection · Distributed DB
MCP

Prometheus & Jaeger MCP Tools

Built Model Context Protocol (MCP) tools for the LangGraph agent: a Prometheus server with PromQL querying and a set of Jaeger trace tools, so agents can query live metrics and distributed traces during incident resolution.

MCP · Prometheus · Jaeger · LangGraph
viz

Agent Trace Visualizer

Created a tool that converts JSONL agent outputs into readable, timestamped HTML reports, making evaluation workflows faster and more legible for the entire research team.

Python · JSONL · HTML · Evaluation
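A minimal sketch of the JSONL-to-HTML idea, not the actual tool. The event fields (`timestamp` as Unix seconds, `role`, `content`) and the function name `render_trace` are assumptions made for illustration; the real trace schema may differ.

```python
import html
import json
from datetime import datetime, timezone


def render_trace(jsonl_text: str) -> str:
    """Turn JSONL agent events into a simple timestamped HTML table."""
    rows = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        event = json.loads(line)
        # Assumed schema: "timestamp" (Unix seconds), "role", "content".
        ts = datetime.fromtimestamp(event["timestamp"], tz=timezone.utc)
        rows.append(
            "<tr><td>{}</td><td>{}</td><td><pre>{}</pre></td></tr>".format(
                ts.strftime("%Y-%m-%d %H:%M:%S"),
                html.escape(str(event["role"])),
                html.escape(str(event["content"])),  # escape tool output
            )
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"


if __name__ == "__main__":
    sample = "\n".join([
        json.dumps({"timestamp": 0, "role": "agent", "content": "kubectl get pods"}),
        json.dumps({"timestamp": 5, "role": "tool", "content": "<none>"}),
    ])
    print(render_trace(sample))
```

Escaping every field matters here because agent and tool output routinely contains angle brackets that would otherwise break the report.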
TUI

Bubble Tea Fault Injection TUI

Interactive terminal UI written in Go with Bubble Tea for dynamic fault injection and application deployment: a keyboard-driven dashboard for triggering failure drills and managing cluster deployments.

Go · Bubble Tea · TUI · Fault Injection
E2E

Distributed E2E Benchmark Runner

Automated benchmark runner that distributes SREGym problems across multiple nodes and handles cluster creation, parallel tmux execution, and automatic log collection. Adding nodes directly reduces per-node problem load.

Python · tmux · Distributed · Automation
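To illustrate why adding nodes reduces per-node load, here is a small sketch that assumes round-robin assignment. The scheduling policy, the function name `assign_problems`, and the node and problem names are hypothetical; the real runner may distribute work differently.

```python
def assign_problems(problems, nodes):
    """Spread problems over nodes round-robin (an assumed policy)."""
    if not nodes:
        raise ValueError("at least one node is required")
    plan = {node: [] for node in nodes}
    for i, problem in enumerate(problems):
        # Problem i goes to node i mod N, so loads differ by at most one.
        plan[nodes[i % len(nodes)]].append(problem)
    return plan


if __name__ == "__main__":
    problems = [f"problem-{i}" for i in range(90)]
    for count in (3, 6):
        nodes = [f"node-{n}" for n in range(count)]
        plan = assign_problems(problems, nodes)
        heaviest = max(len(batch) for batch in plan.values())
        print(f"{count} nodes: at most {heaviest} problems per node")
```

With 90 problems, three nodes each take 30 and six nodes each take 15, so doubling the node count halves the work assigned to each node.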
90 SRE Problems
4+ Orgs Using SREGym
CAIS '26 ACM Paper Accepted

Also Built

Beyond SREGym: internship work and personal projects that shipped to real users.

Mueller Water Products · May – Sep 2025

Software Engineering Intern

  • AI maintenance chatbot that processes 1,500+ technical PDFs (incl. 400+ page engineering manuals), reducing engineer downtime and linking directly to exact source pages
  • Microsoft Teams bot with live SQL queries and PowerBI dashboard generation
  • Full-stack data entry application for Autopour and Melting machines
RAG · Azure · React · Python · PowerBI · .NET
Personal Projects

FleetCast · iSwipe · YouTube Extension

  • FleetCast: real-time satellite telemetry simulator on Kubernetes + TiDB, actively used in xLab for fault injection research
  • RSO Swiper: React Native app with OpenAI embeddings + Firebase auth for UIUC students to find research labs and student orgs
  • YouTube AI Assistant: published Chrome extension with LangChain-powered summaries and interactive quizzes
Kubernetes · React Native · LangChain · Firebase · OpenAI

Technical Skills

Systems & Cloud
Kubernetes · Docker · Linux · Azure · Helm
AI & ML
LangGraph · LangChain · RAG · OpenAI API · MCP
SRE & Observability
Prometheus · Jaeger · Fault Injection · AIOps
Languages
Python · JavaScript · Java · C++ · C#
Frontend & Full-Stack
React JS/Native · Firebase · .NET · HTML/CSS
Spoken Languages
Polish (Native) · English (Native) · Spanish (Prof.)

Awards & Recognition

2025 Siebel Celebration of Excellence

Honored at the 2025 Celebration of Excellence (×2)

Recognized twice in the same year by the Siebel School of Computing & Data Science for academic excellence and research impact.

James Scholar & Scholarships

James Scholar · Dunn Family Scholarship · Engineering Visionary Scholarship

Grainger College of Engineering distinguished scholar and recipient of two competitive merit scholarships for academic performance and leadership.

Dean's List

Dean's List, 3 Consecutive Semesters

UIUC Grainger College of Engineering Dean's List for three consecutive semesters, maintaining a 3.97 GPA in Computer Science.

Salutatorian

High School Salutatorian

Graduated 2nd of 570 students at Highland Park High School. GPA: 4.66 weighted, 3.99 unweighted.

Polish School Red Ribbon Graduate

Red Ribbon Honor Graduate

Earned the Red Ribbon academic distinction, the highest honor at Polish School, every year from middle school through high school.

Education

Expected May 2028

University of Illinois Urbana-Champaign

3.97 GPA

B.S. Computer Science · Dean's List (3×) · James Scholar

Coursework: Data Structures · Computer Architecture · Discrete Structures · Linear Algebra with Computational Applications · Probability & Statistics for CS · Software Engineering Lab · Systems & Networking Seminar

Graduated 2024

Highland Park High School

4.66 GPA (weighted)

Salutatorian (2nd of 570 students) · Chamber of Commerce Scholarship

Programs & Certifications

NVIDIA Bridge Program

NVIDIA Summer 2025 · Selective Invite

Invited to join a selective program that connects participants with NVIDIA engineers and alumni and offers exposure to GPU computing, AI tooling, and professional development.

Coursework at UIUC

Current and completed courses.