Time: Fridays, noon - 1:05pm (PT)
Location: The Internet / The LSD Lab (Engineering 2, Room 398)
Organizers: Lindsey Kuper, Tyler Sorensen, Reese Levine, and Achilles Benetopoulos
The Languages, Systems, and Data Seminar meets weekly to discuss interesting topics in the areas of programming languages, systems, databases, formal methods, security, software engineering, verification, architecture, and beyond. Our goal is to encourage interactions and discussions between students, researchers, and faculty with interests in these areas. The seminar is open to everyone interested. Participating UCSC students should register for the 2-credit course CSE 280O (let the organizers know if you’re an undergrad and need a permission code).
For spring 2026, we will continue to host the LSD Seminar in a hybrid fashion. Anyone can attend on Zoom, and local folks can gather in person in the lab. Speakers can join either in person or on Zoom, whichever is convenient.
Talks will be advertised on the ucsc-lsd-seminar-announce (for anyone) and lsd-group (for UCSC-affiliated people) mailing lists.
| Date | Speaker | Title |
|---|---|---|
| April 3 | No Speaker | Social Hour |
| April 10 | Audrey Cheng | Rethinking Database Optimization for Modern Workloads |
| April 17 | Achilles Benetopoulos | Conference Practice Talk |
| April 24 | Elizaveta Pertseva | Automating Verification of ZKP Arithmetizations in Lean |
| May 1 | Lindsey Kuper | Can you keep a secret? A new protocol for sender-side enforcement of causal message delivery |
| May 8 | Nikos Pagonas | VineLM: Trie-Based Fine-Grained Control for Agentic Workflows |
| May 15 | Micah Murray | Designing a Datacenter-wide Distributed Shared Log |
| May 22 | Thalia Archibald | UNIX V4: History and Recovery |
| May 29 | Zheyuan Chen | SIMT-Step Execution: A Flexible Operational Semantics for GPU Subgroup Behavior |
| June 5 | Scott Kovach | TBD |
April 3
Speaker: No Speaker
Social Hour in place of talk.
April 10
Speaker: Audrey Cheng
Title: Rethinking Database Optimization for Modern Workloads
Abstract: Data systems face unprecedented scalability demands as modern applications, especially AI workloads, evolve rapidly. These shifts make it increasingly difficult to maintain both performance and correctness, which are the core properties that databases must provide. In this talk, I discuss how to rethink database optimization by exploiting workload semantics in modern large-scale applications and how we can scale these efforts by automating this optimization with AI. First, I will present my work on reducing data contention, which remains a crucial performance bottleneck, by leveraging contention patterns in modern workloads. My research addresses this challenge via transaction scheduling: instead of resolving conflicts after they occur, I focus on preventing them by reordering transactions to avoid conflicts before execution. I will then discuss how we build on these results by leveraging AI-driven methods to enable the rapid exploration and generation of optimization methods, with the broader goal of automating performance optimization in data systems.
Bio: Audrey is a PhD student at UC Berkeley, advised by Natacha Crooks and Ion Stoica. Her research focuses on performance optimization for database systems. Her work has been deployed in industry databases at Meta, PlanetScale, and TiDB. She was named a Rising Star in EECS and has received an NSF GRFP Fellowship, a Meta Research PhD Fellowship, a Berkeley Chancellor’s Fellowship, a VLDB Best Industry Paper Award, and invitations to the Best of VLDB journal.
April 17
This week we will have practice talks for upcoming conference presentations.
Achilles Benetopoulos: Yield Not Thy Core, to be presented at EuroSys.
April 24
Speaker: Elizaveta Pertseva
Title: Automating Verification of ZKP Arithmetizations in Lean
Abstract: Many modern Zero-Knowledge Proof (ZKP) systems rely on arithmetizations to encode machine-level (bitvector) arithmetic as finite field operations. The correctness of these arithmetizations is crucial for soundness, but constructing them can be error-prone. Existing verification workflows are either manual, requiring substantial human effort, or rely on SMT solvers, which scale poorly on these problems. In this talk, I will discuss existing verification tools and present our novel automated Lean-based approach that uses type translation to enable more scalable and trustworthy verification of ZKP arithmetizations.
Bio: Elizaveta Pertseva is a 3rd-year PhD student at Stanford, advised by Clark Barrett. Her research focuses on automatically formally verifying cryptographic primitives.
May 1
Speaker: Lindsey Kuper
Title: Can you keep a secret? A new protocol for sender-side enforcement of causal message delivery
Abstract: Protocols for causal message delivery are widely used in distributed systems. Traditionally, causal delivery can be enforced either on the message sender’s side or on the receiver’s side. The traditional sender-side approach avoids the message metadata overhead of the receiver-side approach, but is more conservative than necessary. We present Cykas (“Can you keep a secret?”), a new protocol for sender-side enforcement of causal delivery that sidesteps the conservativeness of the traditional sender-side approach by allowing eager sending of messages and constraining the behavior of their recipients. We implemented the Cykas protocol in Rust and checked the safety and liveness of our implementation using the Stateright implementation-level model checker. Our experiments show that for applications involving long-running jobs, Cykas has a performance advantage: Cykas lets long-running jobs start (and end) earlier, leading to shorter overall execution time compared to the traditional sender-side approach.
May 8
Speaker: Nikos Pagonas
Title: VineLM: Trie-Based Fine-Grained Control for Agentic Workflows
Abstract: Agentic workflows interleave configurable LLM stages with tool stages and often include retries or refinement loops. Existing workflow managers profile full workflow configurations offline and assign each request a static workflow-level plan that binds each configurable LLM stage to a single model, reuses that model across repeated loop iterations, and does not revisit those choices at runtime. In this talk, I will present VineLM, a workflow manager that enables fine-grained control by choosing the model for each stage invocation as execution unfolds, under request-level objectives such as maximizing accuracy under cost or latency budgets. VineLM represents feasible executions as an annotated trie of model-choice prefixes and uses checkpointing and cascade profiling to estimate path accuracy, cost, and latency without exhaustively profiling every request on every path. At runtime, VineLM re-roots the trie after each stage invocation and replans over the remaining subtrie using the realized execution prefix and remaining latency budget. On NL2SQL and math reasoning workflows, VineLM improves the cost-latency-accuracy frontier over coarse workflow-level baselines, achieving up to 18% higher accuracy at the same per-request budget, while its sparse profiling reduces offline profiling cost by 98–99.8% compared to exhaustive profiling.
Bio: Nikos is a second-year PhD student in Computer Science at Columbia University and a member of DAPLab, advised by Prof. Kostis Kaffes. His research focuses on improving the performance and efficiency of agentic serving by building systems that exploit workflow structure and runtime characteristics. In Summer 2025, Nikos was a Student Researcher at Google. Before joining Columbia, he was a member of the ATLAS research group at Brown University, advised by Prof. Nikos Vasilakis. Nikos received his Master’s degree in Electrical and Computer Engineering from the National Technical University of Athens (NTUA), where he was advised by Prof. Georgios Goumas. His research is supported by the Columbia Presidential Fellowship, the A.G. Leventis Foundation, and a Gerondelis Foundation Graduate Study Scholarship.
May 15
Speaker: Micah Murray
Title: Designing a Datacenter-wide Distributed Shared Log
Abstract: Distributed shared logs simplify the implementation and interoperation of data stores. This paper addresses a simple question: Is it feasible to build a single, datacenter-wide distributed shared log that can support all the data stores running in a datacenter? We answer in the affirmative by presenting RingWorld, a scalable log based on a ring of programmable switches that can sustain tens of billions of appends per second while maintaining low latency. We hope the design of RingWorld will propel the adoption of shared logs as a core part of datacenter infrastructure.
Bio: Micah is a 4th-year PhD student at UC Berkeley, co-advised by Prof. Natacha Crooks and Prof. Scott Shenker. His research interests are broadly in distributed systems and networking, and he is particularly interested in problems involving the co-design of distributed systems with networking hardware. His current focus is on building highly scalable logging infrastructure. You can see more about his current and past work at his website.
May 22
Speaker: Thalia Archibald
Title: UNIX V4: History and Recovery
Abstract: We recently recovered UNIX V4 from a 1974 magnetic tape at the University of Utah. This version of the UNIX operating system, thought to have been lost, was the 19th copy distributed to the public, just months after the first public announcement. It was originally acquired by Martin Newell while managing the computer graphics laboratory, and it was likely connected to his foundational research in procedural modeling and the famous Utah teapot. UNIX V4 was the culmination of the effort to rewrite the kernel in C, made possible by the introduction of structs to the language, and has shaped all modern operating systems. In this talk, I put this artifact into context within the larger history of UNIX and demonstrate period-appropriate software development with a paper-printing teletype and replica PDP-11.
Bio: Thalia is a first-year PhD student at the University of Utah, advised by John Regehr. Her research is in verifying compilers and applying translation validation to verify LLVM optimizations.
May 29
Speaker: Zheyuan Chen
Title: SIMT-Step Execution: A Flexible Operational Semantics for GPU Subgroup Behavior
Abstract: Modern GPUs execute programs using a SIMT model, where small groups of threads, called subgroups or warps, execute together and support high-performance communication and coordination. Many optimized kernels rely on subgroup APIs, but the guarantees behind subgroup behavior are often unclear. We present SIMT-Step, a flexible operational semantics for subgroup execution. SIMT-Step helps clarify what it means for subgroup invocations to execute “together,” and how different programming-model guarantees affect which programming patterns are portable. By comparing these guarantees with behavior observed across real GPUs, this work highlights a gap between how programmers reason about subgroups and what languages actually specify. The goal is to provide a clearer foundation for portable, high-performance GPU programming.
Bio: Zheyuan Chen is a second-year PhD student at the University of California, Santa Cruz, advised by Tyler Sorensen. His research has focused on GPU semantics and the design of portable, high-performance GPU programs.
June 5
Speaker: Scott Kovach
Title: TBD
Abstract: TBD