Time: Fridays, noon - 1:05pm (PT)
Location: The Internet / The LSD Lab (Engineering 2, Room 398)
Organizers: Lindsey Kuper, Tyler Sorensen, Reese Levine, and Achilles Benetopoulos
The Languages, Systems, and Data Seminar meets weekly to discuss interesting topics in the areas of programming languages, systems, databases, formal methods, security, software engineering, verification, architecture, and beyond. Our goal is to encourage interactions and discussions between students, researchers, and faculty with interests in these areas. The seminar is open to everyone interested. Participating UCSC students should register for the 2-credit course CSE 280O (let the organizers know if you’re an undergrad and need a permission code).
For fall 2025, we will continue to host the LSD Seminar in a hybrid fashion. Anyone can attend on Zoom, and local folks can gather in person in the lab. Speakers can join either in person or on Zoom, whichever is convenient.
Talks will be advertised on the ucsc-lsd-seminar-announce (for anyone) and lsd-group (for UCSC-affiliated people) mailing lists.
| Date | Speaker | Title |
|---|---|---|
| Sept 26 | NA | NA |
| Oct 3 | Jessica Dagostini, Yanwen Xu, and Patrick Redmond | Conference Practice Talks |
| Oct 10 | Reese Levine and Nathan Liittschwager | Conference Practice Talks |
| Oct 17 (Cancelled) | NA | NA |
| Oct 24 | Tom Lyon | NFS Must Die! (and how to get Beyond File Sharing in the Cloud) |
| Oct 31 | Mingwei Zheng | Semantic Bug Detection for Reliable Network Protocol Implementations |
| Nov 7 | Tommy McMichen | Representing Data Collections for Analysis and Transformation |
| Nov 14 | Lasse Møldrup | AWDIT: An Optimal Weak Database Isolation Tester |
| Nov 21 | Eric Chan | TBD |
| Nov 28 | NA | NA |
| Dec 5 | Jayaprabhakar Kadarkarai | TBD |
Sept. 26
Social Hour!
Oct. 3
This week we will have practice talks for upcoming conference presentations.
Jessica Dagostini: miniGiraffe: A Pangenomic Mapping Proxy App, to appear at IISWC 2025
Yanwen Xu: BetterTogether: An Interference-Aware Framework for Fine-grained Software Pipelining on Heterogeneous SoCs, to appear at IISWC 2025
Patrick Redmond: Exploring the Theory and Practice of Concurrency in the Entity-Component-System Pattern, to appear at OOPSLA 2025
Oct. 10
This week we will have practice talks for upcoming conference presentations.
Reese Levine: SafeRace: Assessing and Addressing WebGPU Memory Safety in the Presence of Data Races, to be presented at OOPSLA
Nathan Liittschwager: CRDT Emulation, Simulation, and Representation Independence, to be presented at ICFP
Oct. 17
Seminar cancelled because of ICFP/OOPSLA and SOSP.
Oct. 24
Speaker: Tom Lyon
Title: NFS Must Die! (and how to get Beyond File Sharing in the Cloud)
Abstract: One of the most important lessons learned in distributed computing and concurrency is that shared mutable data is a bad idea. What is the purpose of a network file system? To provide a shared mutable data space. There are many other problems with the NFS model at cloud scale. NFS remains popular because its killer feature is access to large data sets, by network-unaware applications, without having to first copy them. Using existing file systems, OverlayFS, and NVMe-Over-Fabrics, we propose a new approach to achieve blazing-fast, highly scalable, and consistent access to dynamic data sets. We solicit collaborators.
Bio: Tom Lyon is a mostly retired computing systems architect, serial entrepreneur and UNIX Greybeard. His most recent startup was DriveScale, which created a disaggregated server management system, and was sold to Twitter in 2021. Prior to DriveScale, Tom was founder and Chief Scientist of Nuova Systems, a start-up that led a new architectural approach to systems and networking. Nuova was acquired in 2008 by Cisco, whose highly successful UCS servers and Nexus switches are based on Nuova’s technology. He was also founder and CTO of two other technology companies. Netillion, Inc. was an early promoter of memory-over-network technology. The Netillion team moved to Nuova Systems. At Ipsilon Networks, Tom invented IP Switching. Ipsilon was acquired by Nokia and provided IP routing and security technology for many operator and enterprise networks. As employee #8 at Sun Microsystems he contributed to the UNIX kernel, led many networking and storage projects, and was one of the NFS and SPARC architects. He started his Silicon Valley career at Amdahl Corp., where he was a software architect responsible for creating Amdahl’s UNIX for mainframes technology. Tom holds numerous US patents in system interconnects, memory systems, and storage. He received a BS in Electrical Engineering and Computer Science from Princeton University.
Oct. 31
Speaker: Mingwei Zheng
Title: Semantic Bug Detection for Reliable Network Protocol Implementations
Abstract: Countless devices around the world communicate through network protocols, forming the backbone of modern digital infrastructure. Ensuring the security and correctness of these protocol implementations is critical, as flaws can lead to service disruptions, security vulnerabilities, and data loss. While extensive research has focused on low-level reliability through techniques such as fuzzing and traditional program analysis, true robustness also depends on high-level semantic conformance to the behaviors prescribed by natural language protocol standards. This latter aspect, semantic correctness, remains under-explored in the domain of network protocol testing.
In this talk, I will present three complementary efforts aimed at detecting semantic bugs in network protocol implementations. First, I will introduce ParDiff, a static differential analysis framework that identifies silent parser bugs by comparing multiple independent implementations of the same protocol. ParDiff automatically extracts finite state machines (FSMs) from programs to model protocol message formats and employs bisimulation and SMT-based reasoning to reveal fine-grained semantic discrepancies. Second, I will discuss ParCleanse, which leverages advances in large language models to automatically extract message formats from RFCs and generate both positive and negative test cases to evaluate parser correctness. Finally, I will present RFCAudit, an LLM agent designed to align RFC documents with source code to detect functional bugs beyond parsers. RFCAudit integrates an indexing agent that performs semantic indexing of source code, and a detection agent that conducts retrieval-guided consistency checking to uncover specification violations. Across these efforts, our research has uncovered over 100 semantic bugs in widely used network protocol implementations, demonstrating the promise of semantic bug detection for building secure and trustworthy network software.
Bio: Mingwei Zheng is a Ph.D. candidate in Computer Science at Purdue University, advised by Prof. Xiangyu Zhang. Her research lies at the intersection of large language models (LLMs) and software engineering. She focuses on building efficient and effective LLM agents for automated software development tasks such as code generation, software testing, and program repair, with the broader goal of improving software correctness, robustness, and trustworthiness. Her work has been published in top-tier conferences including OOPSLA, ASE, ISSTA, S&P, CCS, and NeurIPS, and has been recognized with the ACM SIGPLAN Distinguished Paper Award (OOPSLA 2024) and a NeurIPS 2025 Spotlight. She has completed two research internships at Microsoft Research (RiSE Group) and is currently an Applied Science Intern at AWS AGI.
Nov. 7
Speaker: Tommy McMichen
Title: Representing Data Collections for Analysis and Transformation
Abstract: Compiler research and development has treated computation as the primary driver of performance improvements in C/C++ programs, leaving memory optimizations as a secondary consideration. Developers are currently handed the arduous task of describing both the semantics and layout of their data in memory, prematurely lowering high-level data collections to a low-level view of memory for the compiler. This forces an early commitment to low-level memory representations that obscures high-level structure and blocks memory layout optimizations.
In this talk I will describe MEMOIR: an SSA intermediate representation with data collections as a first-class citizen. At its core, MEMOIR decouples the memory used to store data from the memory used to logically organize it. Through its SSA form, MEMOIR enables static analysis on collection elements and allows us to generalize traditional analyses and transformations to operate on these elements. Furthermore, preserving these high-level abstractions in the compiler allows us to automate memory optimizations that must be performed manually today.
Bio: Tommy McMichen is a final-year Ph.D. student at Northwestern University, advised by Simone Campanoni, where he created and leads the MEMOIR project: a compiler intermediate representation with data collections as first-class citizens. Tommy’s research focuses on developing language-agnostic intermediate representations that retain high-level semantic information to enable more precise static analysis and unlock automatic optimizations on data organization and representation. His work aims to bridge the gap between high-level programming abstractions and low-level performance optimization through automated compiler techniques.
Nov. 14
Speaker: Lasse Møldrup
Title: AWDIT: An Optimal Weak Database Isolation Tester
Abstract: In order to achieve low latency, high throughput, and partition tolerance, modern databases forgo strong transaction isolation for weak isolation guarantees. However, several production databases have been found to suffer from isolation bugs, breaking their data-consistency contract. Black-box testing is a prominent technique for detecting isolation bugs, by checking whether histories of database transactions adhere to a prescribed isolation level. The complexity of such testing has recently been shown to be polynomial for weak database isolation levels, but existing testers have a large polynomial complexity, restricting testing to workloads of only moderate size, which is not typical of large-scale databases.
In this work we develop AWDIT, a highly efficient and provably optimal tester for weak database isolation. Given a history H of size n and k sessions, AWDIT tests whether H satisfies the most common weak isolation levels of Read Committed (RC), Read Atomic (RA), and Causal Consistency (CC) in time O(n^(3/2)), O(n^(3/2)), and O(n * k), respectively, improving significantly over the state of the art. Moreover, we prove that AWDIT is essentially optimal, in the sense that there is a conditional lower bound of n^(3/2) for any weak isolation level between RC and CC. Our experiments show that AWDIT is significantly faster than existing, highly optimized testers; e.g., for the ~20% largest histories, AWDIT obtains an average speedup of 245x, 193x, and 62x for RC, RA, and CC, respectively, over the best baseline.
Bio: Lasse Møldrup is a second-year Ph.D. student at Aarhus University, advised by Andreas Pavlogiannis. His research focuses on the intersection of algorithms, complexity theory, and programming languages, particularly on problems related to testing concurrent systems. Lasse’s work typically begins with a performance-critical problem, such as a particular testing task, and asks: what is the theoretical limit on algorithmic efficiency for this problem, and can we design an algorithm that matches it? His research has been accepted to POPL and PLDI, and he received a Distinguished Paper Award for his PLDI paper.
Nov. 21
Speaker: Eric Chan
Title: TBD
Abstract: TBD
Bio: TBD
Nov. 28
No seminar (Thanksgiving break)
Dec. 5
Speaker: Jayaprabhakar Kadarkarai
Title: TBD
Abstract: TBD
Bio: TBD