Languages, Systems, and Data Seminar (Spring 2025)

Time: Fridays, noon - 1:05pm (PT)
Location: The Internet / The LSD Lab (Engineering 2, Room 398)
Organizers: Lindsey Kuper, Tyler Sorensen, Reese Levine, and Gan Shen

The Languages, Systems, and Data Seminar meets weekly to discuss interesting topics in the areas of programming languages, systems, databases, formal methods, security, software engineering, verification, architecture, and beyond. Our goal is to encourage interactions and discussions between students, researchers, and faculty with interests in these areas. The seminar is open to everyone interested. Participating UCSC students should register for the 2-credit course CSE 280O (let the organizers know if you’re an undergrad and need a permission code).

For spring 2025, we will continue to host the LSD Seminar in a hybrid fashion. Anyone can attend on Zoom, and local folks can gather in person in the lab. Speakers can join either in person or on Zoom, whichever is convenient.

Talks will be advertised on the ucsc-lsd-seminar-announce (for anyone) and lsd-group (for UCSC-affiliated people) mailing lists.

Date	Speaker	Title
April 4	Shadaj Laddad	Hydro: Modular, Correct, and Performant Distributed Systems
April 11	Eva Graversen	A Theory of Choreographic Programming
April 18	Huaifeng Zhang	The Hidden Bloat in Machine Learning Systems
April 25	Oliver Kennedy	Draupnir: A toolkit for scalable, easy-to-write declarative compilers
May 2	TBD
May 9	Patrick Redmond	Formally-verified chimeric consensus
May 16	Amanda Xu	Synthesizing Quantum-Circuit Optimizers
May 23	Abtin Molavi	Qubit Mapping and Routing for an Evolving Quantum Hardware Landscape
May 30	Achilles Benetopoulos	The world changes constantly; so should your systems
June 6	TBD

April 4

Speaker: Shadaj Laddad

Title: Hydro: Modular, Correct, and Performant Distributed Systems

Abstract: We’ve been stuck with programming models for distributed systems that force developers to individually program a fleet of sequential programs and manually reason about global correctness, or leverage frameworks that lock the developer into global scaling and fault tolerance decisions. Choreographic programming gives us a step in the right direction, by letting developers write distributed programs while avoiding opaque network boundaries. But it has significant limitations when it comes to reasoning about concurrent executions. In this talk, I’ll present the key ideas behind Hydro, a Rust framework for general-purpose distributed systems. I’ll show how Hydro leverages stream-choreographic programming to offer low-level control with strong correctness guarantees.

Bio: Shadaj is a final-year PhD student co-advised by Alvin Cheung and Joe Hellerstein in the Sky Computing Lab at UC Berkeley. He designs programming paradigms that make distributed systems modular, correct, accessible, and performant. His research spans foundational language design, program optimization, and formal methods. Shadaj is a co-organizer of the SF Systems Club, the largest systems-oriented meetup in the Bay Area, and loves to play the sitar in his spare time.

April 11

Speaker: Eva Graversen

Title: A Theory of Choreographic Programming

Abstract: Choreographic programming promises a simple approach to the coding of concurrent and distributed systems: write the collective communication behavior of a system of processes as a choreography, and then the programs for these processes are automatically compiled by a provably-correct procedure known as endpoint projection. While this promise prompted substantial research, a theory that can deal with realistic communication failures in a distributed network remains elusive.

In this talk, I will provide a theory of choreographic programming that addresses realistic communication failures taken from the literature of distributed systems: processes can send or receive fewer messages than they should (send and receive omission), and the network can fail at transporting messages (omission failure). This theory supports the programming of strategies for failure recovery, and a novel static analysis (called robustness) to check for delivery guarantees (at-most-once and exactly-once).

A key technical innovation is a deconstruction of the usual communication primitive in choreographies to allow for independent implementations of the send and receive actions of a communication, while still retaining the static guarantee that these actions will correlate correctly (the essence of choreographic programming). This has two main benefits. First, each side of a communication can adopt its own failure recovery strategy, as in realistic protocols. Second, initiating new communications does not require any (unrealistic) synchronization over unreliable channels: senders and receivers agree by construction on how each message should be identified.

Bio: Eva Graversen is a Postdoc at Tallinn University of Technology working on theoretical foundations of choreographic programming. Her PhD was under Iain Phillips and Nobuko Yoshida at Imperial College London, working on concurrent reversible computing.

April 18

Speaker: Huaifeng Zhang

Title: The Hidden Bloat in Machine Learning Systems

Abstract: In this talk, we explore the problem of bloat in machine learning (ML) systems and present key findings from our study. We begin by analyzing bloat in ML containers, which are widely used in modern cloud computing environments. Next, we examine code bloat in shared libraries of popular ML frameworks such as TensorFlow and PyTorch. We introduce our approaches for debloating both ML containers and shared libraries. Our study reveals that both ML containers and shared libraries include a substantial amount of unnecessary files and code, which bloat their size and lead to performance degradation, resource waste, and increased security vulnerabilities.

Bio: Huaifeng Zhang is a Ph.D. student in the Department of Computer Science at Chalmers University of Technology, advised by Prof. Ahmed Ali-Eldin Hassan. His research interests include software engineering, machine learning systems, cloud computing and blockchain. He also held positions at ByteDance and Google.

April 25

Speaker: Oliver Kennedy

Title: Draupnir: A toolkit for scalable, easy-to-write declarative compilers

Abstract: Compilers and program analysis tools are defined by sets of rules, or productions, that govern how code is translated, optimized, or analyzed. Production rules are compositional, allowing compiler authors to subdivide a global objective into smaller-scale local logical units that are easier to reason about. For example, a compiler author writing an optimizer might specify a rule that identifies if-then-else statements, where the condition is trivially tautological, and replaces it withthe body of the then clause. When compiling (resp., optimizing, analyzing) a block of code, the compiler (or optimizer, or program analysis tool) iteratively selects an appropriate production rule, applies it, and repeats.

The simplicity of local reasoning with production rules comes at the cost of redundant computation. To improve compile times, many production compilers adopt non-idiomatic design patterns, such as merging production rules or tracking non-local properties manually. While necessary for performance, such anti-patterns increase code complexity and make the code harder to maintain. In this talk, I will introduce Draupnir, a new compiler implementation framework based around a database engine. Draupnir allows compiler authors to continue expressing compiler logic through normal production rules, but approaches compilation, optimization, and analysis as forms of query processing. This shift in perspective allows Draupnir to leverage techniques from the database community, including Incremental View Maintenance (SIGMOD 2020), multi-query optimization (in-progress), optimization for the memory hierarchy (in-progress), and distribution (in-progress) to achieve orders of magnitude improvement in compile times, while simultaneously simplifying the logic of the compiler.

Bio: Oliver Kennedy is an associate professor at the University at Buffalo. He earned his PhD from Cornell University in 2011 and now leads the Online Data Interactions (ODIn) lab, which operates at the intersection of databases and programming languages. Oliver is the recipient of an NSF CAREER award, an IEEE Region 1 Technological Innovation Award, UB’s Exceptional Scholar Award, and several UB SEAS teaching awards. Oliver is also one of the founding board members of Breadcrumb Analytics. Several of Oliver’s papers have been invited to “Best of” compilations from SIGMOD and VLDB. The ODIn lab is currently exploring (i) how we can leverage database techniques like incremental view maintenance to make compilers faster, (ii) how to make it easier for data scientists to track how sources of uncertainty, ambiguity, and/or bias affect analyses, and (iii) how to streamline the interfaces — both human and software — between different tools for data science, like python, sql, and spreadsheets.

May 2

Speaker: TBD

Title: TBD

Abstract: TBD

Bio: TBD

May 9

Speaker: Patrick Redmond

Title: Formally-verified chimeric consensus

Abstract: Distributed Consensus is the problem of reaching agreement among of set of networked nodes. For example, it could be reaching agreement about the next transition, among the many that are pending, to apply in a replicated database. Traditionally this problem was formulated among a fixed set of participants for which a minority may fail by crashing. If we extend the setting to include malicious participants, the problem may be called Byzantine Fault-Tolerant (BFT) Consensus. If we extended yet again to open systems, as opposed to systems with a globally known set of participants, it may now be called Permissionless Consensus. Solutions to consensus problems may be factored into a voting algorithm and a quorum system, i.e. a definition of what constitutes a set of votes sufficient to trigger a consensus decision. By decomposing the problem along these two axes, we are able to obtain a formally verified “chimeric” consensus implementation that uses the voting algorithm of Paxos with the permissionless quorum system of Stellar Consensus Protocol.

Bio: Patrick is a PhD student in the LSD Lab at UCSC advised by Lindsey Kuper. Patrick’s research explores the design, implementation, and verification of distributed and concurrent systems, with a PL-angle. Patrick has several years of industry experience, and hopes to help spread the use of software verification techniques there.

May 16

Speaker: Amanda Xu

Title: Synthesizing Quantum-Circuit Optimizers

Abstract: Recent breakthroughs in quantum computing hardware have brought us closer to realizing the dream of quantum computers accelerating domains such as materials discovery. However, optimizing the programs we run on these error-prone devices is a critical software obstacle to overcome, which will remain even when we have achieved error correction at scale. As an additional roadblock, quantum hardware is diverse and constantly in flux as scientists run new experiments. To avoid the need for tedious manual updates to the optimizer with every hardware modification, my work aims to automatically synthesize a quantum-circuit optimizer for a given device.

In this talk, I will first present QUESO, an efficient approach for synthesizing correct rewrite rules given the set of operations supported by a particular platform. QUESO is powered by an algebraic view of circuit semantics. Then I will discuss a follow-up work that introduced a framework to unify these fast rewrite rules with slow unitary synthesis — two previously disparate ideas for optimizing quantum circuits. This work also presented a radically simple algorithm, GUOQ, for scheduling these optimizations in a way that exploits the synergies of rewriting and resynthesis. I will show how these automated approaches for discovering transformations and a good schedule for applying them outperform state-of-the-art hand-crafted alternatives.

Bio: Amanda Xu is a 4th year PhD student at the University of Wisconsin-Madison, advised by Aws Albarghouthi. Her research pushes the boundaries of quantum-circuit optimization using techniques from PL and formal methods. You can learn more and contact her at https://amandashoe.github.io/.

May 23

Speaker: Abtin Molavi

Title: Qubit Mapping and Routing for an Evolving Quantum Hardware Landscape

Abstract: As Amanda discussed last week, quantum computing promises to unlock breakthroughs in important domains like materials science and chemistry. However, to realize this potential in practice, we need optimizing quantum circuit compilers which can adapt to a rapidly evolving landscape of quantum hardware.

In this talk, I’ll elaborate on our efforts towards efficient and flexible quantum compilers by focusing on a compiler stage called qubit mapping and routing (QMR). The goal of QMR is to find a mapping from circuit qubits to physical locations on a target quantum device and plan the routing of multi-qubit gates in a way that is compliant with certain connectivity constraints.

We’ll begin by considering the qubit mapping and routing problem for near-term quantum computers without error correction. Using a reduction to MaxSAT and a “circuit slicing” technique, we find better solutions than existing heuristic-based approaches while scaling further than existing constraint-based approaches. Then, we turn to the qubit mapping and routing problem for fault-tolerant quantum devices. We solve this problem by exploiting the dependency structure of circuit operations to formulate discrete optimization problems that can be solved near-optimally with simulated annealing. Finally, I’ll close by discussing our ongoing work towards a unified framework for QMR problems, including these two examples among several. By extracting the common core structure of QMR and isolating the problem-specific constraints, we enable automatic generation of the QMR compiler phase for a particular class of quantum computer from a concise description.

Bio: Abtin Molavi is a PhD candidate at the University of Wisconsin-Madison, advised by Aws Albarghouthi. In his research, he tackles problems that emerge in compiling quantum circuits to physical hardware by drawing from tools and techniques in PL and formal methods.

May 30

Speaker: Achilles Benetopoulos

Title: The world changes constantly; so should your systems

Abstract: Programming distributed systems has never been easier, but making them perform well and efficiently exploit available resources remains stubbornly difficult. Programs that are incredibly fast on a single host slow to a crawl when given more resources, while systems designed for scale squander resources on modest problems. In this talk, I will present magpie, a programming model and framework that builds on a principle of extreme adaptivity of data and compute placement, enabling fast code to remain fast when distributed while providing high-performance, data-intensive applications with richer correctness guarantees than state-of-the-art systems.

Bio: Achilles is a fourth year PhD candidate at UCSC, working at the intersection of distributed systems, databases, and programming languages with Peter Alvaro.

June 6

Speaker: TBD

Title: TBD

Abstract: TBD

Bio: TBD