Languages, Systems, and Data Seminar (Fall 2024)

Time: Fridays, noon - 1:05pm (PT)
Location: The Internet / The LSD Lab (Engineering 2, Room 398)
Organizers: Lindsey Kuper, Tyler Sorensen, Gan Shen, and Reese Levine


The Languages, Systems, and Data Seminar meets weekly to discuss interesting topics in the areas of programming languages, systems, databases, formal methods, security, software engineering, verification, architecture, and beyond. Our goal is to encourage interactions and discussions between students, researchers, and faculty with interests in these areas. The seminar is open to everyone interested. Participating UCSC students should register for the 2-credit course CSE 280O (let the organizers know if you’re an undergrad and need a permission code).

For spring 2024, we will continue to host the LSD Seminar in a hybrid fashion. Anyone can attend on Zoom, and local folks can gather in person in the lab. Speakers can join either in person or on Zoom, whichever is convenient.

Talks will be advertised on the ucsc-lsd-seminar-announce (for anyone) and lsd-group (for UCSC-affiliated people) mailing lists.

Date Speaker Title
Oct 4 Jinsheng Ba Testing Database Engines via Query Plans
Oct 11 Jonathan Castello Inductive diagrams for causal reasoning
Oct 18 Bastian Köpcke Descend: A Safe Imperative GPU Programming Language
Oct 25 Haofan Zheng TBD
Nov 1 Justin Lubin TBD
Nov 8 Bhakti Shah TBD
Nov 15 Federico Mora TBD
Nov 22 Haining Tong TBD
Dec 6 Yanan Guo TBD

Oct 4

Speaker: Jinsheng Ba

Title: Testing Database Engines via Query Plans

Abstract: Database Management Systems (DBMSs) are fundamental software systems that store, maintain, and retrieve data. They are used in almost every personal computer, mobile device, and server. Therefore, it is important to find bugs before they incur severe consequences. Automatic testing is an efficient and effective technique to find crash bugs, which terminate DBMSs, but is struggling to detect logic bugs and performance issues. Logic bugs refer to incorrect results, while performance issues refer to unexpected slow performance. Unlike crash bugs, both categories of bugs do not terminate DBMSs and are hard to observe by existing automatic testing methods. Triggering them requires valid test cases, which are also challenging to generate automatically. In my PhD thesis, I advance automatic testing to efficiently find logic bugs and performance issues in DBMSs. My approaches are united by the idea of leveraging query plans, which are internal representations of how a DBMS executes a query, for automatically testing DBMSs. I put forward the following thesis statement: Query plans allow efficient and effective testing of DBMSs by providing internal execution information. I propose four research works to utilize query plans for testing. First, to detect performance issues, I propose Cardinality Estimation Restriction Testing (CERT), which inspects estimated cardinalities in query plans without measuring execution time. Second, to identify logic bugs, I propose Differential Query Plans (DQP), which inspects the result consistency of multiple query plans of the same query. Third, to generate diverse test cases for exploring target systems thoroughly, I propose Query Plan Guidance (QPG) for guiding the test case generation process towards diverse query plans. Last, observing that query plans cannot be conveniently used as they are exposed in various DBMS-specific representations, I propose a Unified query plan representation (Uplan) based on an empirical study aiming to reduce the effort of building applications based on query plans. Since most DBMSs—including commercial ones—expose query plans to the user, I consider CERT, DQP, QPG, and Uplan generally applicable, black-box approaches for finding logic bugs, performance issues, and building applications on query plans. These methods are effective as they found more than 100 unique and previously unknown bugs in several widely used DBMSs. I view these results as a step towards more reliable DBMSs, and expect this statement of utilizing query plans for testing can be widely adopted to tackle more problems, such as test suite evaluation, debugging deployed DBMSs, and optimization checking.

Bio: Jinsheng Ba is a postdoctoral researcher in the Advanced Software Technologies (AST) Lab at ETH Zurich, mentored by Prof. Zhendong Su. He received his Ph.D. degree in Computer Science from National University of Singapore (NUS) in 2024, advised by Dr. Manuel Rigger. His research interests include software testing, security, and software engineering. His work has been recognized with Distinguished Paper Award twice at ASE 2022 and ICSE 2023. He is also a recipient of NUS Dean’s Graduate Research Excellence Award.

Oct 11

Speaker: Jonathan Castello

Title: Inductive diagrams for causal reasoning

Abstract: Causality diagrams are used throughout academia and industry to informally reason about the possible executions of distributed and concurrent systems. Their formal counterparts, what we’ll call Lamport executions, have been used since at least Lamport’s 1978 introduction of logical clocks. However, Lamport executions are not quite as visceral a model for humans as causality diagrams. This mismatch is exacerbated in modern proof assistants, which are geared toward compositional reasoning over inductive data. In this talk, we’ll present a new formal representation of executions that is both inductively-structured and more similar to the informal diagrams we started with.

Bio: Jonathan is a third-year Ph.D. student at UC Santa Cruz. His research focuses on treating concurrency as something to be started from, rather than something to be introduced to an existing system. This perspective leads to interests in distributed systems, network protocols, and low-level hardware, among other threads(!). In his spare time, he nurtures a continued fascination with all things space.

Oct 18

Speaker: Bastian Köpcke

Title: Descend: A Safe Imperative GPU Programming Language

Abstract: Programming massively parallel hardware such as Graphics Processing Units (GPU) is challenging to get right. Programmers must correctly and efficiently coordinate thousands of threads and their accesses to various shared memory spaces. Existing mainstream GPU programming languages, such as CUDA and OpenCL, are based on C/C++ inheriting their fundamentally unsafe ways to access memory via raw pointers. This facilitates easy to make, but hard to detect bugs, such as data races and deadlocks.

In this talk, I will present Descend, our approach to a safe GPU programming language. In contrast to prior safe high-level GPU programming approaches, Descend is an imperative low-level GPU programming language in the spirit of Rust, enforcing safe CPU and GPU memory management in the type system by tracking Ownership and Lifetimes. Descend introduces a new holistic GPU programming model where computations are hierarchically scheduled over the GPU’s execution resources: grid, blocks, warps, and threads. Descend’s extended Borrow checking ensures that execution resources safely access memory regions without data races. For this, we introduced views describing safe parallel access patterns of memory regions, as well as atomic variables. For memory accesses that cannot be checked by our type system, users can annotate limited code sections as unsafe.

I will discuss the memory safety guarantees offered by Descend at the hands of examples, take a brief look at the formal type system, and show an evaluation of the current implementation using multiple benchmarks, demonstrating that the approach that we take with Descend is capable of expressing real-world GPU programs showing competitive performance compared to manually written CUDA programs that are lacking Descend’s safety guarantees.

This talk will be based on work that appeared at PLDI 2024: https://dl.acm.org/doi/10.1145/3656411

Bio: Bastian is a Computer Science PhD student in the Parallel and Distributed Systems research group at the University of Münster, working closely with Michel Steuwer’s programming language and compiler group at TU Berlin. He is interested in programming languages, verification and compiler techniques to enable safer and more productive programming of parallel hardware.

Oct 25

Speaker: Haofan Zheng

Title: TBD

Abstract: TBD

Bio: TBD

Nov 1

Speaker: Justin Lubin

Title: TBD

Abstract: TBD

Bio: TBD

Nov 8

Speaker: Bhakti Shah

Title: TBD

Abstract: TBD

Bio: TBD

Nov 15

Speaker: Federico Mora

Title: TBD

Abstract: TBD

Bio: TBD

Nov 22

Speaker: Haining Tong

Title: TBD

Abstract: TBD

Bio: TBD

Dec 6

Speaker: Yanan Guo

Title: TBD

Abstract: TBD

Bio: TBD


Archive