Languages, Systems, and Data Seminar (Spring 2026)

Time: Fridays, noon - 1:05pm (PT)
Location: The Internet / The LSD Lab (Engineering 2, Room 398)
Organizers: Lindsey Kuper, Tyler Sorensen, Reese Levine, and Achilles Benetopoulos

The Languages, Systems, and Data Seminar meets weekly to discuss interesting topics in the areas of programming languages, systems, databases, formal methods, security, software engineering, verification, architecture, and beyond. Our goal is to encourage interactions and discussions between students, researchers, and faculty with interests in these areas. The seminar is open to everyone interested. Participating UCSC students should register for the 2-credit course CSE 280O (let the organizers know if you’re an undergrad and need a permission code).

For spring 2026, we will continue to host the LSD Seminar in a hybrid fashion. Anyone can attend on Zoom, and local folks can gather in person in the lab. Speakers can join either in person or on Zoom, whichever is convenient.

Talks will be advertised on the ucsc-lsd-seminar-announce (for anyone) and lsd-group (for UCSC-affiliated people) mailing lists.

Date	Speaker	Title
April 3	No Speaker	Social Hour
April 10	Audrey Cheng	Rethinking Database Optimization for Modern Workloads
April 17	Achilles Benetopoulos	Conference Practice Talk
April 24	Elizaveta Pertseva	Automating Verification of ZKP Arithmetizations in Lean
May 1	Lindsey Kuper	Can you keep a secret? A new protocol for sender-side enforcement of causal message delivery
May 8	Nikos Pagonas	VineLM: Trie-Based Fine-Grained Control for Agentic Workflows
May 15	Micah Murray	Designing a Datacenter-wide Distributed Shared Log
May 22	Thalia Archibald	UNIX V4: History and Recovery
May 29	Zheyuan Chen	SIMT-Step Execution: A Flexible Operational Semantics for GPU Subgroup Behavior
June 5	Scott Kovach	Formalizing Extensible Board Games

April 3

Speaker: No Speaker

Social Hour in place of talk.

April 10

Speaker: Audrey Cheng

Title: Rethinking Database Optimization for Modern Workloads

Abstract: Data systems face unprecedented scalability demands as modern applications, especially AI workloads, evolve rapidly. These shifts make it increasingly difficult to maintain both performance and correctness, which are the core properties that databases must provide. In this talk, I discuss how to rethink database optimization by exploiting workload semantics in modern large-scale applications and how we can scale these efforts by automating this optimization with AI. First, I will present my work on reducing data contention, which remains a crucial performance bottleneck, by leveraging contention patterns in modern workloads. My research addresses this challenge via transaction scheduling: instead of resolving conflicts after they occur, I focus on preventing them by reordering transactions to avoid conflicts before execution. I will then discuss how we build on these results by leveraging AI-driven methods to enable the rapid exploration and generation of optimization methods, with the broader goal of automating performance optimization in data systems.

Bio: Audrey is a PhD student at UC Berkeley, advised by Natacha Crooks and Ion Stoica. Her research focuses on performance optimization for database systems. Her work has been deployed in industry databases at Meta, PlanetScale, and TiDB. She was named a Rising Star in EECS and has received an NSF GRFP Fellowship, a Meta Research PhD Fellowship, a Berkeley Chancellor’s Fellowship, a VLDB Best Industry Paper Award, and invitations to the Best of VLDB journal.

April 17

This week we will have practice talks for upcoming conference presentations.

Achilles Benetopoulos: Yield Not Thy Core, to be presented at EuroSys.

April 24

Speaker: Elizaveta Pertseva

Title: Automating Verification of ZKP Arithmetizations in Lean

Abstract: Many modern Zero-Knowledge Proof (ZKP) systems rely on arithmetizations to encode machine-level (bitvector) arithmetic as finite field operations. The correctness of these arithmetizations is crucial for soundness, but constructing them can be error-prone. Existing verification workflows are either manual, requiring substantial human effort, or rely on SMT solvers, which scale poorly on these problems. In this talk, I will discuss existing verification tools and present our novel automated Lean-based approach that uses type translation to enable more scalable and trustworthy verification of ZKP arithmetizations.

Bio: Elizaveta Pertseva is a third-year PhD student at Stanford, advised by Clark Barrett. Her research focuses on automated formal verification of cryptographic primitives.

May 1

Speaker: Lindsey Kuper

Title: Can you keep a secret? A new protocol for sender-side enforcement of causal message delivery

Abstract: Protocols for causal message delivery are widely used in distributed systems. Traditionally, causal delivery can be enforced either on the message sender’s side or on the receiver’s side. The traditional sender-side approach avoids the message metadata overhead of the receiver-side approach, but is more conservative than necessary. We present Cykas (“Can you keep a secret?”), a new protocol for sender-side enforcement of causal delivery that sidesteps the conservativeness of the traditional sender-side approach by allowing eager sending of messages and constraining the behavior of their recipients. We implemented the Cykas protocol in Rust and checked the safety and liveness of our implementation using the Stateright implementation-level model checker. Our experiments show that for applications involving long-running jobs, Cykas has a performance advantage: Cykas lets long-running jobs start (and end) earlier, leading to shorter overall execution time compared to the traditional sender-side approach.

May 8

Speaker: Nikos Pagonas

Title: VineLM: Trie-Based Fine-Grained Control for Agentic Workflows

Abstract: Agentic workflows interleave configurable LLM stages with tool stages and often include retries or refinement loops. Existing workflow managers profile full workflow configurations offline and assign each request a static workflow-level plan that binds each configurable LLM stage to a single model, reuses that model across repeated loop iterations, and does not revisit those choices at runtime. In this talk, I will present VineLM, a workflow manager that enables fine-grained control by choosing the model for each stage invocation as execution unfolds under request-level objectives such as maximizing accuracy under cost or latency budgets. VineLM represents feasible executions as an annotated trie of model-choice prefixes and uses checkpointing and cascade profiling to estimate path accuracy, cost, and latency without exhaustively profiling every request on every path. At runtime, VineLM re-roots the trie after each stage invocation and replans over the remaining subtrie using the realized execution prefix and remaining latency budget. On NL2SQL and math reasoning workflows, VineLM improves the cost-latency-accuracy frontier over coarse workflow-level baselines, achieving up to 18% higher accuracy at the same per-request budget. Its sparse profiling reduces offline profiling cost by 98–99.8% compared to exhaustive profiling.

Bio: Nikos is a second-year PhD student in Computer Science at Columbia University and a member of DAPLab, advised by Prof. Kostis Kaffes. His research focuses on improving the performance and efficiency of agentic serving by building systems that exploit workflow structure and runtime characteristics. In Summer 2025, Nikos was a Student Researcher at Google. Before joining Columbia, he was a member of the ATLAS research group at Brown University, advised by Prof. Nikos Vasilakis. Nikos received his Master’s degree in Electrical and Computer Engineering from the National Technical University of Athens (NTUA), where he was advised by Prof. Georgios Goumas. His research is supported by the Columbia Presidential Fellowship, the A.G. Leventis Foundation, and a Gerondelis Foundation Graduate Study Scholarship.

May 15

Speaker: Micah Murray

Title: Designing a Datacenter-wide Distributed Shared Log

Abstract: Distributed shared logs simplify the implementation and interoperation of data stores. This paper addresses a simple question: Is it feasible to build a single, datacenter-wide distributed shared log that can support all the data stores running in a datacenter? We answer in the affirmative by presenting RingWorld, a scalable log based on a ring of programmable switches that can sustain tens of billions of appends per second while maintaining low latency. We hope the design of RingWorld will propel the adoption of shared logs as a core part of datacenter infrastructure.

Bio: Micah is a fourth-year PhD student at UC Berkeley, co-advised by Prof. Natacha Crooks and Prof. Scott Shenker. His research interests are broadly in distributed systems and networking, and he is particularly interested in problems involving the co-design of distributed systems with networking hardware. His current focus is on building highly scalable logging infrastructure. You can see more about his current and past work at his website.

May 22

Speaker: Thalia Archibald

Title: UNIX V4: History and Recovery

Abstract: We recently recovered UNIX V4 from a 1974 magnetic tape at the University of Utah. This version of the UNIX operating system, thought to have been lost, was the 19th copy distributed to the public, just months after the first public announcement. It was originally acquired by Martin Newell while managing the computer graphics laboratory, and it was likely connected to his foundational research in procedural modeling and the famous Utah teapot. UNIX V4 was the culmination of the effort to rewrite the kernel in C, made possible by the introduction of structs to the language, and has shaped all modern operating systems. In this talk, I put this artifact into context within the larger history of UNIX and demonstrate period-appropriate software development with a paper-printing teletype and replica PDP-11.

Bio: Thalia is a first-year PhD student at the University of Utah, advised by John Regehr. Her research is in verifying compilers and applying translation validation to verify LLVM optimizations.

May 29

Speaker: Zheyuan Chen

Title: SIMT-Step Execution: A Flexible Operational Semantics for GPU Subgroup Behavior

Abstract: Modern GPUs execute programs using a SIMT model, where small groups of threads, called subgroups or warps, execute together and support high-performance communication and coordination. Many optimized kernels rely on subgroup APIs, but the guarantees behind subgroup behavior are often unclear. We present SIMT-Step, a flexible operational semantics for subgroup execution. SIMT-Step helps clarify what it means for subgroup invocations to execute “together,” and how different programming-model guarantees affect which programming patterns are portable. By comparing these guarantees with behavior observed across real GPUs, this work highlights a gap between how programmers reason about subgroups and what languages actually specify. The goal is to provide a clearer foundation for portable, high-performance GPU programming.

Bio: Zheyuan Chen is a second-year PhD student at the University of California, Santa Cruz, advised by Tyler Sorensen. His research has focused on GPU semantics and the design of portable, high-performance GPU programs.

June 5

Speaker: Scott Kovach

Title: Formalizing Extensible Board Games

Abstract: Modern strategic board games resemble software: they are formal systems that require modular design so that expansions can be composed without conflict. Formalizing them as software is valuable but difficult to do with traditional languages because of the unruly ways that rules typically modify control flow; whereas software designers desire predictability and correctness, game designers craft surprising emergent behavior by combining simple elements that often augment or override previous ones.

This talk will present a new language, Turn, that mixes logical and imperative programming styles by reducing both to monotone logic program evaluation. Its key idea is to equip each logical fact with an interval of time, restricting which other facts it can interact with. Such facts can be uniformly used to encode data, (time-varying) relationships, game phases, and pending choices to be made by players. This strategy enables rule sets that match the style and conciseness of rules written in natural language.

Bio: Scott Kovach is a seventh-year PhD student at Stanford, advised by Fred Kjolstad. He is interested in building more interactive and learnable programming systems with the help of declarative languages and query optimization. Currently, he is working on Turn, a language inspired by extensible card games and situation semantics. Previously, he developed indexed streams, an algebraic method of compiling relational iterators for sparse tensor arithmetic and worst-case optimal join queries.