Recent events

Human Factors in Secure Software Development: How we can help developers write secure code

Yasemin Acar MPI-SP
11 Mar 2021, 10:00 am - 11:00 am
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
We are seeing a persistent gap between the theoretical security of e.g. cryptographic algorithms and real world vulnerabilities, data-breaches and possible attacks. Software developers – despite being computer experts – are rarely security experts, and security and privacy are usually, at best, of secondary importance for them. They may not have training in security and privacy or even be aware of the possible implications, and they may be unable to allocate time or effort to ensure that security and privacy best practices and design principles are upheld for their end users. …
We are seeing a persistent gap between the theoretical security of e.g. cryptographic algorithms and real world vulnerabilities, data-breaches and possible attacks. Software developers – despite being computer experts – are rarely security experts, and security and privacy are usually, at best, of secondary importance for them. They may not have training in security and privacy or even be aware of the possible implications, and they may be unable to allocate time or effort to ensure that security and privacy best practices and design principles are upheld for their end users. Understanding their education and mindsets, their processes, the tools that they use, and their pitfalls are the foundation for shifting development practices to be more secure. This talk will give an overview of security challenges for developers, and interdisciplinary research avenues to address these.

--

Please contact MPI-SWS Office team for link information.
Read more

Automatic Vulnerability Discovery at Scale

Marcel Böhme Monash University, Australia
09 Mar 2021, 10:00 am - 11:00 am
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
To establish software security at scale, we need efficient automated vulnerability discovery techniques that can run on thousands of machines. In this talk, we will discuss the abundant opportunities and fundamental limitations of fuzzing, one of the most successful vulnerability discovery techniques. We will explore why only an exponential number of machines will allow us to discover software bugs at a linear rate. We will discuss the kind of correctness guarantees that we can expect from automatic vulnerability discovery, …
To establish software security at scale, we need efficient automated vulnerability discovery techniques that can run on thousands of machines. In this talk, we will discuss the abundant opportunities and fundamental limitations of fuzzing, one of the most successful vulnerability discovery techniques. We will explore why only an exponential number of machines will allow us to discover software bugs at a linear rate. We will discuss the kind of correctness guarantees that we can expect from automatic vulnerability discovery, anywhere from formally proving the absence of bugs to statistical claims about program correctness. We shall touch upon unexpected connections to ecological biostatistics and information theory which allow us to address long-standing scientific and practical problems in automatic software testing. Finally, we will take a forward looking view and discuss our larger vision for the field of software security.

--

Please contact MPI-SWS Office for Zoom link information
Read more

Exterminating bugs in real systems

Fraser Brown Stanford
08 Mar 2021, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
Software is everywhere, and almost everywhere, software is broken. Some bugs just crash your printer; others hand an identity thief your bank account number; still others let nation-states spy on dissidents and persecute minorities. This talk outlines my work preventing bugs using a blend of programming languages techniques and systems design. First, I'll talk about securing massive, security-critical codebases without clean slate rewrites. This means rooting out hard-to-find bugs---as in Sys, which scales symbolic execution to find exploitable bugs in systems like the twenty-million line Chrome browser. …
Software is everywhere, and almost everywhere, software is broken. Some bugs just crash your printer; others hand an identity thief your bank account number; still others let nation-states spy on dissidents and persecute minorities. This talk outlines my work preventing bugs using a blend of programming languages techniques and systems design. First, I'll talk about securing massive, security-critical codebases without clean slate rewrites. This means rooting out hard-to-find bugs---as in Sys, which scales symbolic execution to find exploitable bugs in systems like the twenty-million line Chrome browser. It also means proving correctness of especially vulnerable pieces of code---as in VeRA, which automatically verifies part of the Firefox JavaScript engine. Finally, I'll discuss work on stronger foundations for new systems---as in CirC, a recent project unifying compiler infrastructure for program verification, cryptographic proofs, optimization problems, and more.

--

Please contact MPI-SWS Office for link information
Read more

Advancing Visual Intelligence via Neural System Design

Hengshuang Zhao University of Oxford
05 Mar 2021, 9:30 am - 10:30 am
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
Building intelligent visual systems is essential for the next generation of artificial intelligence systems. It is a fundamental tool for many disciplines and beneficial to various potential applications such as autonomous driving, robotics, surveillance, augmented reality, to name a few. An accurate and efficient intelligent visual system has a deep understanding of the scene, objects, and humans. It can automatically understand the surrounding scenes. In general, 2D images and 3D point clouds are the two most common data representations in our daily life. …
Building intelligent visual systems is essential for the next generation of artificial intelligence systems. It is a fundamental tool for many disciplines and beneficial to various potential applications such as autonomous driving, robotics, surveillance, augmented reality, to name a few. An accurate and efficient intelligent visual system has a deep understanding of the scene, objects, and humans. It can automatically understand the surrounding scenes. In general, 2D images and 3D point clouds are the two most common data representations in our daily life. Designing powerful image understanding and point cloud processing systems are two pillars of visual intelligence, enabling the artificial intelligence systems to understand and interact with the current status of the environment automatically. In this talk, I will first present our efforts in designing modern neural systems for 2D image understanding, including high-accuracy and high-efficiency semantic parsing structures, and unified panoptic parsing architecture. Then, we go one step further to design neural systems for processing complex 3D scenes, including semantic-level and instance-level understanding. Further, we show our latest works for unified 2D-3D reasoning frameworks, which are fully based on self-attention mechanisms. In the end, the challenges, up-to-date progress, and promising future directions for building advanced intelligent visual systems will be discussed.

--

Please contact MPI-SWS office for link information.
Read more

Post-Moore Systems — Challenges and Opportunities

Antoine Kaufmann Max Planck Institute for Software Systems
04 Mar 2021, 10:30 am - 11:30 am
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
Satisfying the growing demand for more compute as processor performance continues to stagnate in the post-Moore era requires radical changes throughout the systems stack. A proven strategy is to move from today’s general purpose platforms to post-Moore systems, specialized systems comprising tightly integrated and co-designed hardware and software components. Currently, designing and implementing these systems is a complex, laborious, and risky process, accessible to few and only practical for the most computing intensive applications. In this talk, …
Satisfying the growing demand for more compute as processor performance continues to stagnate in the post-Moore era requires radical changes throughout the systems stack. A proven strategy is to move from today’s general purpose platforms to post-Moore systems, specialized systems comprising tightly integrated and co-designed hardware and software components. Currently, designing and implementing these systems is a complex, laborious, and risky process, accessible to few and only practical for the most computing intensive applications. In this talk, I discuss the nascent challenges in building post-Moore systems and illustrate them through specific systems I have built. As a first step to address these challenges, I present Simbricks, a modular simulation framework that flexibly combines existing simulators for computers, custom hardware, and networks, into complete virtual post-Moore systems, enabling developers to compare and evaluate designs earlier and quicker. I conclude with a look towards future opportunities in abstractions, tooling, and methodology to further simplify the development of post-Moore systems.



Please contact MPI-SWS Office for link information.
Read more

Model Checking Under Weak Memory Concurrency

Viktor Vafeiadis Max Planck Institute for Software Systems
03 Mar 2021, 12:15 pm - 1:15 pm
Virtual talk
Joint Lecture Series
Model checking is an automatic approach for testing and verifying programs, and has proven to be very effective in a number of settings ranging from hardware designs to intricate low-level systems software. In this talk, I will present our recent research on applying model checking to weakly consistent concurrent programs. I will explain the key challenges involved in making model checking effective in this setting and how to address them.

---

Join Zoom Meeting https://zoom.us/j/91226662016?pwd=UmdLZ01WRW1YUnVmZ0NjQXVBSDQrdz09

Meeting ID: 912 2666 2016 Passcode: jls-20-21
Model checking is an automatic approach for testing and verifying programs, and has proven to be very effective in a number of settings ranging from hardware designs to intricate low-level systems software. In this talk, I will present our recent research on applying model checking to weakly consistent concurrent programs. I will explain the key challenges involved in making model checking effective in this setting and how to address them.

---

Join Zoom Meeting https://zoom.us/j/91226662016?pwd=UmdLZ01WRW1YUnVmZ0NjQXVBSDQrdz09

Meeting ID: 912 2666 2016 Passcode: jls-20-21

New Advances in (Adversarially) Robust and Secure Machine Learning

Hongyang Zhang Toyota Technological Institute at Chicago
03 Mar 2021, 9:30 am - 10:30 am
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
In this talk, I will describe a distributionally robust learning framework that offers accurate uncertainty quantification and rigorous guarantees under data distribution shift. This framework yields appropriately conservative yet still accurate predictions to guide real-world decision-making and is easily integrated with modern deep learning. I will showcase the practicality of this framework in applications on agile robotic control and computer vision. I will also introduce a survey of other real-world applications that would benefit from this framework for future work. …
In this talk, I will describe a distributionally robust learning framework that offers accurate uncertainty quantification and rigorous guarantees under data distribution shift. This framework yields appropriately conservative yet still accurate predictions to guide real-world decision-making and is easily integrated with modern deep learning. I will showcase the practicality of this framework in applications on agile robotic control and computer vision. I will also introduce a survey of other real-world applications that would benefit from this framework for future work.

--

Please contact MPI-SWS Office for Zoom link information
Read more

Towards Trustworthy AI: Provably Robust Extrapolation for Decision Making

Anqi Liu California Institute of Technology
02 Mar 2021, 5:00 pm - 6:00 pm
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
To create trustworthy AI systems, we must safeguard machine learning methods from catastrophic failures. For example, we must account for the uncertainty and guarantee the performance for safety-critical systems, like in autonomous driving and health care, before deploying them in the real world. A key challenge in such real-world applications is that the test cases are not well represented by the pre-collected training data. To properly leverage learning in such domains, we must go beyond the conventional learning paradigm of maximizing average prediction accuracy with generalization guarantees that rely on strong distributional relationships between training and test examples. …
To create trustworthy AI systems, we must safeguard machine learning methods from catastrophic failures. For example, we must account for the uncertainty and guarantee the performance for safety-critical systems, like in autonomous driving and health care, before deploying them in the real world. A key challenge in such real-world applications is that the test cases are not well represented by the pre-collected training data. To properly leverage learning in such domains, we must go beyond the conventional learning paradigm of maximizing average prediction accuracy with generalization guarantees that rely on strong distributional relationships between training and test examples. In this talk, I will describe a distributionally robust learning framework that offers accurate uncertainty quantification and rigorous guarantees under data distribution shift. This framework yields appropriately conservative yet still accurate predictions to guide real-world decision-making and is easily integrated with modern deep learning. I will showcase the practicality of this framework in applications on agile robotic control and computer vision. I will also introduce a survey of other real-world applications that would benefit from this framework for future work.

--

Please contact MPI-SWS office team for Zoom link information
Read more

Opening the Black Box: Towards Theoretical Understanding of Deep Learning

Hu Wei Princeton University, USA
01 Mar 2021, 2:00 pm - 3:00 pm
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
Despite the phenomenal empirical successes of deep learning in many application domains, its underlying mathematical mechanisms remain poorly understood. Mysteriously, deep neural networks in practice can often fit training data perfectly and generalize remarkably well to unseen test data, despite highly non-convex optimization landscapes and significant over-parameterization. Moreover, deep neural networks show extraordinary ability to perform representation learning: feature representation extracted from a trained neural network can be useful for other related tasks.

In this talk, …
Despite the phenomenal empirical successes of deep learning in many application domains, its underlying mathematical mechanisms remain poorly understood. Mysteriously, deep neural networks in practice can often fit training data perfectly and generalize remarkably well to unseen test data, despite highly non-convex optimization landscapes and significant over-parameterization. Moreover, deep neural networks show extraordinary ability to perform representation learning: feature representation extracted from a trained neural network can be useful for other related tasks.

In this talk, I will present our recent progress on building the theoretical foundations of deep learning, by opening the black box of the interactions among data, model architecture, and training algorithm. First, I will show that gradient descent on deep linear neural networks induces an implicit regularization effect towards low rank, which explains the surprising generalization behavior of deep linear networks for the low-rank matrix completion problem. Next, turning to nonlinear deep neural networks, I will talk about a line of studies on wide neural networks, where by drawing a connection to the neural tangent kernels, we can answer various questions such as how training loss is minimized, why trained network can generalize, and why certain component in the network architecture is useful; we also use theoretical insights to design a new simple and effective method for training on noisily labeled datasets. Finally, I will analyze the statistical aspect of representation learning, and identify key data conditions that enable efficient use of training data, bypassing a known hurdle in the i.i.d. tasks setting.

--

Please contact the MPI-SWS office team for link information.
Read more

Measuring and Enhancing the Security of Machine Learning

Florian Tramer Stanford
25 Feb 2021, 5:00 pm - 6:00 pm
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
Failures of machine learning systems can threaten both the security and privacy of their users. My research studies these failures from an adversarial perspective, by building new attacks that highlight critical vulnerabilities in the machine learning pipeline, and designing new defenses that protect users against identified threats. In the first part of this talk, I'll explain why machine learning models are so vulnerable to adversarially chosen inputs. I'll show that many proposed defenses are ineffective and cannot protect models deployed in overtly adversarial settings, …
Failures of machine learning systems can threaten both the security and privacy of their users. My research studies these failures from an adversarial perspective, by building new attacks that highlight critical vulnerabilities in the machine learning pipeline, and designing new defenses that protect users against identified threats. In the first part of this talk, I'll explain why machine learning models are so vulnerable to adversarially chosen inputs. I'll show that many proposed defenses are ineffective and cannot protect models deployed in overtly adversarial settings, such as for content moderation on the Web. In the second part of the talk, I'll focus on the issue of data privacy in machine learning systems, and I'll demonstrate how to enhance privacy by combining techniques from cryptography, statistics, and computer security.

--

Please contact MPI-SWS Office for link information
Read more

Data-Centric Debugging or: How I Learned to Stop Worrying and Use 'Big Data' Techniques to Diagnose Software Bugs

Andrew Quinn University of Michigan
24 Feb 2021, 2:00 pm - 3:00 pm
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
Software bugs are pervasive and costly.  As a result, developers spend the majority of their time debugging their software.  Traditionally, debugging involves inspecting and tracking the runtime behavior of a program.  Alas, program inspection is computationally expensive, especially when employing powerful techniques such as dynamic information flow tracking, data-race detection, and data structure invariant checks.  Moreover, debugging logic is difficult to specify correctly.  Current tools (e.g., gdb, Intel Pin) allow developers to write debugging logic in an imperative inline programming model that mirrors the programming style of traditional software.  …
Software bugs are pervasive and costly.  As a result, developers spend the majority of their time debugging their software.  Traditionally, debugging involves inspecting and tracking the runtime behavior of a program.  Alas, program inspection is computationally expensive, especially when employing powerful techniques such as dynamic information flow tracking, data-race detection, and data structure invariant checks.  Moreover, debugging logic is difficult to specify correctly.  Current tools (e.g., gdb, Intel Pin) allow developers to write debugging logic in an imperative inline programming model that mirrors the programming style of traditional software.  So, debugging logic faces the same challenges as traditional software, including concurrency, fault handling, dynamic memory, and extensibility.  In general, specifying debugging logic can be as difficult as writing the program being debugged!

In this talk, I will describe a new data-centric debugging framework that alleviates the performance and specification limitations of current debugging models.  The key idea is to use deterministic record and replay to treat a program execution as a massive data object consisting of all program states reached during the execution.  In this framework, developers can express common debugging tasks (e.g., tracking the value of a variable) and dynamic analyses (e.g., data-race detection) as queries over an execution's data object.  My research explores how a data-centric model enables large-scale parallelism to accelerate debugging queries (JetStream and SledgeHammer) and relational query models to simplify the specification of debugging logic (SledgeHammer and the OmniTable Query Model).

--

Please contact MPI-SWS office for Zoom link information
Read more

Towards an Actionable Understanding of Conversations

Justine Zhang Cornell University
23 Feb 2021, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
Conversations are central to our social systems. Understanding how conversationalists navigate through them could unlock great improvements in domains like public health, where the provision of social support is crucial. To this end, I develop computational frameworks that can capture and systematically examine aspects of conversations that are difficult, interesting and meaningful for conversationalists and the jobs they do. Importantly, these frameworks aim to yield actionable understandings—ones that reflect the choices that conversationalists make and their consequences, …
Conversations are central to our social systems. Understanding how conversationalists navigate through them could unlock great improvements in domains like public health, where the provision of social support is crucial. To this end, I develop computational frameworks that can capture and systematically examine aspects of conversations that are difficult, interesting and meaningful for conversationalists and the jobs they do. Importantly, these frameworks aim to yield actionable understandings—ones that reflect the choices that conversationalists make and their consequences, beyond the inert linguistic patterns that are produced in the interaction.

Please contact MPI-SWS Office for link information.
Read more

Algorithmic Approaches in Finite-ModelTheory With Interdisciplinary Applications

Sandra Kiefer RWTH Aachen University
22 Feb 2021, 10:30 am - 11:30 am
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
Graphs are widespread models for relations between entities. One of the fundamental problems when dealing with graphs is to decide isomorphism, i.e., to check whether two graphs are structurally identical. Even after decades of research, the quest for an efficient graph-isomorphism test still continues. In this talk, I will discuss the Weisfeiler-Leman (WL) algorithm as a powerful combinatorial procedure to approach the graph-isomorphism problem. The algorithm can be seen as a link between many research areas (the ""WL net""), …
Graphs are widespread models for relations between entities. One of the fundamental problems when dealing with graphs is to decide isomorphism, i.e., to check whether two graphs are structurally identical. Even after decades of research, the quest for an efficient graph-isomorphism test still continues. In this talk, I will discuss the Weisfeiler-Leman (WL) algorithm as a powerful combinatorial procedure to approach the graph-isomorphism problem. The algorithm can be seen as a link between many research areas (the ""WL net""), including, for example, descriptive complexity theory, propositional proof complexity, and machine learning. I will present work regarding the two central parameters of the algorithm – its dimension and the number of iterations – and explore their connection to finite-model theory. I will also touch on some past and ongoing projects in other areas from the WL net.

--

Please contact MPI-SWS office team for link information.
Read more

What Models do we Need in Computer Vision? From Optical Flow to Scene Representations

Eddy Ilg University of Freiburg, Germany
18 Feb 2021, 4:00 pm - 5:00 pm
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
Deep learning today is successful in almost any domain of computer vision. The talk will revisit the seminal work of FlowNet to show how deep learning was applied to optical flow and led to a paradigm shift in this domain. Optical flow, disparity, motion and depth boundaries as well as uncertainty estimation with multi-hypothesis networks will be covered and it will be discussed how deep learned models could surpass traditional methods. Asking the more fundamental question what models we need in computer vision, …
Deep learning today is successful in almost any domain of computer vision. The talk will revisit the seminal work of FlowNet to show how deep learning was applied to optical flow and led to a paradigm shift in this domain. Optical flow, disparity, motion and depth boundaries as well as uncertainty estimation with multi-hypothesis networks will be covered and it will be discussed how deep learned models could surpass traditional methods. Asking the more fundamental question what models we need in computer vision, the talk will then progress to recent deep-learned scene representation approaches such as the ones obtained by learned signed distance functions and NeRF and provide a perspective on how computer vision might change in the future.

--

Please contact the MPI-SWS office team for link information.
Read more

Breaking the chains of implicit trust

Riad Wahby Stanford
17 Feb 2021, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
The success of today's hardware and software systems is due in part to a mature toolbox of techniques, like abstraction, that systems designers use to manage complexity. While powerful, these techniques are also subtly dangerous: they induce implicit trust relationships among system components and between related systems, presenting attackers with many opportunities to undermine the integrity of our hardware and software. This talk discusses an approach to building systems with precise control over trust, drawing on techniques from theoretical computer science. …
The success of today's hardware and software systems is due in part to a mature toolbox of techniques, like abstraction, that systems designers use to manage complexity. While powerful, these techniques are also subtly dangerous: they induce implicit trust relationships among system components and between related systems, presenting attackers with many opportunities to undermine the integrity of our hardware and software. This talk discusses an approach to building systems with precise control over trust, drawing on techniques from theoretical computer science. Making this approach practical is a challenge that requires innovation across the entire technology stack, from hardware to theory. I will present several examples of such innovations from my research and describe a few potential directions for future work.

--

Please contact MPI-SWS Office Team for link information
Read more

Using Data More Responsibly

Juba Ziani University of Pennsylvania
16 Feb 2021, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
Data is now everywhere: enormous amounts of data are produced and processed every day. Data is gathered,exchanged, and used extensively in computations that serve many purposes: e.g., computing statistics on populations, refining bidding strategies in ad auctions, improving recommendation systems, and making loan or hiring decisions. Yet, data is not always transacted and processed in a responsible manner. Data collection often happens without the data holders' consent, who may also not be compensated for their data. …
Data is now everywhere: enormous amounts of data are produced and processed every day. Data is gathered,exchanged, and used extensively in computations that serve many purposes: e.g., computing statistics on populations, refining bidding strategies in ad auctions, improving recommendation systems, and making loan or hiring decisions. Yet, data is not always transacted and processed in a responsible manner. Data collection often happens without the data holders' consent, who may also not be compensated for their data. Privacy leaks are numerous, exhibiting a need for better privacy protections on personal and sensitive data. Data-driven machine learning and decision making algorithms have been shown to both mimic past bias and to introduce additional bias in their predictions, leading to inequalities and discrimination. In this talk, I will focus on my research on using data in a more responsible manner. The main focus of the talk will be on my work on the privacy issues that arise in data transactions and data-driven analysis, under the lens of a framework known as differential privacy. I will go over my work on designing transactions for data where we provide differential privacy guarantees to the individuals whose sensitive data we are buying and using in computations, and will focus on my recent work on providing differential privacy to agents in auction settings, where it is natural to want to protect the valuations and bids of said agents. I will also give a brief overview of the other directions that I have followed in my research, both on the optimization and economic challenges that arise when letting agents opt in and out of data sharing and compensating them sufficiently for their data contributions, and on how to reduce the disparate and discriminatory impact of data-driven decision-making.

--

Please contact MPI-SWS office team for Zoom link information
Read more

Building Scalable Network Stacks for Modern Applications

Ahmed Saeed MIT
15 Feb 2021, 2:00 pm - 3:00 pm
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
The network stack in today's operating systems is a remnant from a time when a server had a handful of cores and processed requests from a few thousand clients. It simply cannot keep up with the scale of modern servers and the requirements of modern applications. Specifically, real-time applications and high user expectations enforce strict performance requirements on the infrastructure. Further, there is a fundamental shift in the way hardware capacity scales from simply relying on Moore's law to deliver faster hardware every couple of years to leveraging parallel processing and task-specific accelerators. …
The network stack in today's operating systems is a remnant from a time when a server had a handful of cores and processed requests from a few thousand clients. It simply cannot keep up with the scale of modern servers and the requirements of modern applications. Specifically, real-time applications and high user expectations enforce strict performance requirements on the infrastructure. Further, there is a fundamental shift in the way hardware capacity scales from simply relying on Moore's law to deliver faster hardware every couple of years to leveraging parallel processing and task-specific accelerators. This talk covers innovations in three key components of the network stack. First, I will cover my work on scalable packet scheduling in software network stacks, improving the control of traffic outgoing from large-scale servers. Second, I will move on to my work on improving overload control for servers handling microsecond-scale remote procedure calls, providing better control over incoming traffic to large-scale servers. Then, the talk covers my work on Wide Area Network (WAN) congestion control, focusing on network-assisted congestion control schemes, where end-to-end solutions fail. The talk will conclude with a discussion of plans for future research in this area.

--

Please contact MPI-SWS office team for Zoom link information
Read more

Data-Driven Transfer of Insight between Brains and AI Systems

Mariya Toneva Carnegie Mellon University, USA
11 Feb 2021, 3:00 pm - 4:00 pm
Virtual talk
CIS@MPG Tenure-Track Faculty Recruiting
Several major innovations in artificial intelligence (AI) (e.g. convolutional neural networks, experience replay) are based on findings about the brain. However, the underlying brain findings took many years to first consolidate and many more to transfer to AI. Moreover, these findings were made using invasive methods in non-human species. For cognitive functions that are uniquely human, such as natural language processing, there is no suitable model organism and a mechanistic understanding is that much farther away. …
Several major innovations in artificial intelligence (AI) (e.g. convolutional neural networks, experience replay) are based on findings about the brain. However, the underlying brain findings took many years to first consolidate and many more to transfer to AI. Moreover, these findings were made using invasive methods in non-human species. For cognitive functions that are uniquely human, such as natural language processing, there is no suitable model organism and a mechanistic understanding is that much farther away. In this talk, I will present my research program that circumvents these limitations by establishing a direct connection between the human brain and AI systems with two main goals: 1) to improve the generalization performance of AI systems and 2) to improve our mechanistic understanding of cognitive functions. Lastly, I will discuss future directions that build on these approaches to investigate the role of memory in meaning composition, both in the brain and AI. This investigation will lead to methods that can be applied to a wide range of AI domains, in which it is important to adapt to new data distributions, continually learn to perform new tasks, and learn from few samples.

Please contact MPI-SWS Office Team for link information
Read more

A Social Network under Social Distancing - Experience and Insights during COVID-19 on Risk-Driven Backbone Management

Yiting Xia MPI-INF - D3
03 Feb 2021, 12:15 pm - 1:15 pm
Virtual talk
Joint Lecture Series
This talk reports Facebook's experience of managing the backbone network during the COVID-19 crisis. Our philosophy centers around "risk prevention" to identify potential failures in the network and mitigate their effects. We define metrics for network risk and quantify the impact of COVID-19 with them. We also describe a risk assessment system that has been in production for three years, which involves accurate failure modeling and efficient risk simulation. With ten months of assessment results, we claim our backbone to be robust against the COVID-19 stress test, …
This talk reports Facebook's experience of managing the backbone network during the COVID-19 crisis. Our philosophy centers around "risk prevention" to identify potential failures in the network and mitigate their effects. We define metrics for network risk and quantify the impact of COVID-19 with them. We also describe a risk assessment system that has been in production for three years, which involves accurate failure modeling and efficient risk simulation. With ten months of assessment results, we claim our backbone to be robust against the COVID-19 stress test, achieving high service availability and low routing dilation. We share our operational measures to minimize possible traffic loss. Surprising findings during this period give us insights to further improve our approach. First, we observe a substantial reduction of optical failures because of less human activity, which inspires failure prediction to trade model stability for agility by considering short-term failure statistics when necessary. Second, we find a negative correlation between network traffic and human mobility, indicating non- networking signals traditionally ignored can be used for better network management.
Read more

Vellvm: Verifying LLVM IR Code

Steve Zdancewic University of Pennsylvania
13 Jan 2021, 3:00 pm - 4:00 pm
Virtual talk
SWS Distinguished Lecture Series
LLVM is an industrial-strength compiler that's used for everything from day-to-day iOS development (in Swift) to pie-in-the-sky academic research projects. This makes the LLVM framework a sweet spot for bug-finding and verification technologies--any improvements to it are amplified across its many applications.

This talk asks the question: what does LLVM code _mean_, and, how can we ensure that LLVM-based tools (compilers, optimizers, code instrumentation passes, etc.) do what they're supposed to -- especially for safety- or security-critical applications? …
LLVM is an industrial-strength compiler that's used for everything from day-to-day iOS development (in Swift) to pie-in-the-sky academic research projects. This makes the LLVM framework a sweet spot for bug-finding and verification technologies--any improvements to it are amplified across its many applications.

This talk asks the question: what does LLVM code _mean_, and, how can we ensure that LLVM-based tools (compilers, optimizers, code instrumentation passes, etc.) do what they're supposed to -- especially for safety- or security-critical applications? The Verified LLVM project (Vellvm) is our attempt to provide an answer. Vellvm gives a semantics to LLVM IR programs in the Coq interactive theorem prover, which can be used for developing machine-checkable formal properties about LLVM IR programs and transformation passes.

Our approach to modeling LLVM IR semantics uses _interaction trees_, a data structure that is suitable for representing impure, possibly nonterminating programs in dependent type theory. Interaction trees support compositional and modular reasoning about program semantics but are also executable, and hence useful for implementation and testing. We'll see how interaction trees are used in Vellvm and, along the way, we'll get a taste of what LLVM code looks like including some of its trickier semantic aspects. We'll also see (at a high level) how modern interactive theorem provers--in this case, Coq--can be used to verify compiler transformations.

No experience with LLVM or formal verification technologies will be assumed.

--

Please contact office for the Zoom details.
Read more

The Programmer, The Unknown Being: Program Comprehension Research in the Neuroage

Sven Apel Fachrichtung Informatik - Saarbrücken
13 Jan 2021, 12:15 pm - 1:15 pm
Virtual talk
Joint Lecture Series
The pivotal role of software in our modern world imposes strong requirements on quality, correctness, and reliability of software systems. The ability to understand program code plays a key role for programmers to fulfill these requirements. Despite significant progress, research on program comprehension has had a fundamental limitation: program comprehension is a cognitive process that cannot be directly observed, which leaves considerable room for (mis)interpretation, uncertainty, and confounding factors. Thus, central questions such as "What makes a good programmer?" and "How should we program?" are surprisingly difficult to answer based on the state of the art. …
The pivotal role of software in our modern world imposes strong requirements on quality, correctness, and reliability of software systems. The ability to understand program code plays a key role for programmers to fulfill these requirements. Despite significant progress, research on program comprehension has had a fundamental limitation: program comprehension is a cognitive process that cannot be directly observed, which leaves considerable room for (mis)interpretation, uncertainty, and confounding factors. Thus, central questions such as "What makes a good programmer?" and "How should we program?" are surprisingly difficult to answer based on the state of the art.

In this talk, I will report on recent attempts to lift research on program comprehension to a new level. The key idea is to leverage methods from cognitive neuroscience to obtain insights into the cognitive processes involved in program comprehension. Opening the "black box" of human cognition will lead to a breakthrough in understanding the why and how of program comprehension and to a completely new perspective and methodology of measuring program comprehension, with direct implications for programming methodology, language design, and education.
Read more

Distributed synthesis and negotiations

Anca Muscholl Universite de Bordeaux
04 Dec 2020, 10:00 am - 11:00 am
Virtual talk
SWS Distinguished Lecture Series
This talk will be a survey of instances of the synthesis problem in distributed models with rendez-vous synchronization. I will talk about synthesis for distributed automata within the theory of Mazurkiewicz traces and about the simpler model of negotiation diagrams (aka as workflow nets). 

---

Please contact Office for Zoom details. 

Towards Human Behavior Modeling from (Big) Data: From smart rooms, cars and phones to COVID-19

Nuria Oliver ELLIS
02 Dec 2020, 4:00 pm - 5:00 pm
Virtual talk
SWS Distinguished Lecture Series
Human Behavior Modeling and Understanding is a key challenge in the development of intelligent systems. In my talk I will describe a few of the projects that I have carried out over the course of the past 25 years to address this challenge. In particular, I will give an overview of my work on Smart Rooms (real-time facial expression recognition and visual surveillance), Smart Cars (driver maneuver recognition), Smart Offices (multi-modal office activity recognition), Smart Mobile Phones (boredom inference) and finally a Smart World (pandemics and specifically COVID-19). …
Human Behavior Modeling and Understanding is a key challenge in the development of intelligent systems. In my talk I will describe a few of the projects that I have carried out over the course of the past 25 years to address this challenge. In particular, I will give an overview of my work on Smart Rooms (real-time facial expression recognition and visual surveillance), Smart Cars (driver maneuver recognition), Smart Offices (multi-modal office activity recognition), Smart Mobile Phones (boredom inference) and finally a Smart World (pandemics and specifically COVID-19). On this last area, I will share the lessons learned during the past 8 months as Commissioner to the President of the Valencian Government in Spain on AI and Data Science against COVID-19. It is a very special initiative of collaboration between the civil society at large (through the survey), the scientific community and a public administration (at the President level)

--

Please contact office for Zoom link details
Read more

Internet Measurements: From IPv6 Scanning to the COVID-19 Pandemic

Oliver Gasser MPI-INF - D3
02 Dec 2020, 12:15 pm - 1:30 pm
Virtual talk
Joint Lecture Series
Internet measurements are an important tool to analyze the deployment of protocols, evaluate the performance, resilience, and security of Internet hosts, and to generally better understand the current state of the Internet. In this talk we will introduce the concept of Internet measurements and show its applications, ranging from techniques for scanning the IPv6 Internet, to analyzing the impact of the COVID-19 pandemic on Internet traffic. We show that overall, traffic volumes increased after lockdowns have been imposed. …
Internet measurements are an important tool to analyze the deployment of protocols, evaluate the performance, resilience, and security of Internet hosts, and to generally better understand the current state of the Internet. In this talk we will introduce the concept of Internet measurements and show its applications, ranging from techniques for scanning the IPv6 Internet, to analyzing the impact of the COVID-19 pandemic on Internet traffic. We show that overall, traffic volumes increased after lockdowns have been imposed. Furthermore, traffic of remote working and video conferencing applications increases, whereas other parts of the Internet traffic decrease during the pandemic.
Read more

High-Throughput and Predictable VM Scheduling for High-Density Workloads

Manohar Vanga Max Planck Institute for Software Systems
26 Nov 2020, 3:30 pm - 4:30 pm
Kaiserslautern building G26, room 111
SWS Student Defense Talks - Thesis Defense
In the increasingly competitive public-cloud marketplace, improving the efficiency of data centers is a major concern. One way to improve efficiency is to consolidate as many VMs onto as few physical cores as possible, provided that performance expectations are not violated. However, as a prerequisite for increased VM densities, the hypervisor’s VM scheduler must allocate processor time efficiently and in a timely fashion. As we show in this thesis, contemporary VM schedulers leave substantial room for improvements in both regards when facing challenging high-VM-density workloads that frequently trigger the VM scheduler. …
In the increasingly competitive public-cloud marketplace, improving the efficiency of data centers is a major concern. One way to improve efficiency is to consolidate as many VMs onto as few physical cores as possible, provided that performance expectations are not violated. However, as a prerequisite for increased VM densities, the hypervisor’s VM scheduler must allocate processor time efficiently and in a timely fashion. As we show in this thesis, contemporary VM schedulers leave substantial room for improvements in both regards when facing challenging high-VM-density workloads that frequently trigger the VM scheduler.

As root causes, we identify (i) high runtime overheads and (ii) unpredictable scheduling heuristics. To better support high VM densities, we propose Tableau, a VM scheduler that guarantees a minimum processor share and a maximum bound on scheduling delay for every VM in the system. Tableau combines a low-overhead, core-local, table-driven dispatcher with a fast on-demand table-generation procedure (triggered on VM creation/teardown) that employs scheduling techniques typically used in hard real-time systems. In an evaluation of Tableau and three current Xen schedulers on a 16-core Intel Xeon machine, Tableau is shown to improve tail latency (e.g., a 17x reduction in maximum ping latency compared to Credit) and throughput (e.g., 1.6x peak web server throughput compared to RTDS when serving 1 KiB files with a 100 ms SLA). Further, we show that, owing to its focus on efficiency and scalability, Tableau provides comparable or better throughput than existing Xen schedulers in dedicated-core scenarios as are commonly employed in public clouds today.

Another common requirement in public clouds is the ability to use idle cycles in the system to perform low-priority background work, without affecting the performance of primary VMs (which are typically paid for by customers). The primary obstacle to achieving this is the lack of strong performance guarantees for VMs, which Tableau provides. We present the design of a background scheduler that enables a lower-priority class of VMs to use any idle cycles in the system, and present results showing that they have low impact on the performance of table-driven VMs, making it practical for cloud environments.

Finally, VM churn and workload variations in multi-tenant public clouds result in changing interference patterns at runtime, resulting in performance variation. In particular, variation in last-level cache (LLC) interference has been shown to have a significant impact on virtualized application performance in cloud environments. We present a novel approach for dealing with such dynamically changing interference, which involves periodically regenerating tables that provide the same guarantees on utilization and scheduling latency for all VMs in the system, but have different LLC interference characteristics.

We present two strategies to mitigate LLC interference: a randomized approach, and one that uses performance counters to detect VMs running cache-intensive workloads and selectively mitigate interference. Our results show that randomizing tables works well for mitigating worst-case slowdowns due to cache interference, while the performance-counter-based approach requires a more robust mechanism for detecting interfering VMs in order to match the performance of the randomized approach.
Read more

Can You Believe It? Security and Privacy Case Studies in Online Advertising, Misinformation, and Augmented Reality

Franziska Roesner University of Washingtion
24 Nov 2020, 4:00 pm - 5:00 pm
Virtual talk
SWS Distinguished Lecture Series
People who use modern technologies are inundated with content and information from many sources, including advertisements on the web, posts on social media, and (looking to the future) content in augmented or virtual reality. While these technologies are transforming our lives and communications in many positive ways, they also come with serious risks to users’ security, privacy, and the trustworthiness of content they see: the online advertising ecosystem tracks individual users and may serve misleading or deceptive ads, …
People who use modern technologies are inundated with content and information from many sources, including advertisements on the web, posts on social media, and (looking to the future) content in augmented or virtual reality. While these technologies are transforming our lives and communications in many positive ways, they also come with serious risks to users’ security, privacy, and the trustworthiness of content they see: the online advertising ecosystem tracks individual users and may serve misleading or deceptive ads, social media feeds are full of potential mis/disinformation, and emerging augmented reality technologies can directly modify users’ perceptions of the physical world in undesirable ways. In this talk, I will discuss several lines of research from our lab that explore these issues from a broad computer security and privacy perspective, leveraging methodologies ranging from qualitative user studies to systematic measurement studies to system design and evaluation. What unites these efforts is a key question: what can (or do) users believe about the content they receive through our existing and emerging technologies, and how can we design platforms and ecosystems more robust to these risks?

--

Contact SWS Office for Zoom link details
Read more

Intelligibility Throughout the Machine Learning Life Cycle

Jenn Wortman Vaughan Microsoft Research NYC
18 Nov 2020, 3:00 pm
Virtual talk
SWS Distinguished Lecture Series
People play a central role in the machine learning life cycle. Consequently, building machine learning systems that are reliable, trustworthy, and fair requires that relevant stakeholders—including developers, users, and the people affected by these systems—have at least a basic understanding of how they work. Yet what makes a system "intelligible" is difficult to pin down. Intelligibility is a fundamentally human-centered concept that lacks a one-size-fits-all solution. I will explore the importance of evaluating methods for achieving intelligibility in context with relevant stakeholders, …
People play a central role in the machine learning life cycle. Consequently, building machine learning systems that are reliable, trustworthy, and fair requires that relevant stakeholders—including developers, users, and the people affected by these systems—have at least a basic understanding of how they work. Yet what makes a system "intelligible" is difficult to pin down. Intelligibility is a fundamentally human-centered concept that lacks a one-size-fits-all solution. I will explore the importance of evaluating methods for achieving intelligibility in context with relevant stakeholders, ways of empirically testing whether intelligibility techniques achieve their goals, and why we should expand our concept of intelligibility beyond machine learning models to other aspects of machine learning systems, such as datasets and performance metrics.

--

Please contact Office for the Zoom details.
Read more

Experiments in Machine Behavior: Cooperating with and through machines

Iyad Rahwan Max Planck Institute for Human Development
11 Nov 2020, 3:00 pm
Virtual talk
SWS Distinguished Lecture Series
Human cooperation is fundamental to the success of our species. But emerging machine intelligence poses new challenges to human cooperation. This talk explores two interrelated problems: How can we humans cooperate *through* machines--that is, agree among ourselves on how machines should behave? And how can we cooperate *with* machines--that is, convince self-interested machines to cooperate with us. The talk will then propose an interdisciplinary agenda for understanding and improving our human-machine ecology.

--

Please contact SWS Office for Zoom details
Human cooperation is fundamental to the success of our species. But emerging machine intelligence poses new challenges to human cooperation. This talk explores two interrelated problems: How can we humans cooperate *through* machines--that is, agree among ourselves on how machines should behave? And how can we cooperate *with* machines--that is, convince self-interested machines to cooperate with us. The talk will then propose an interdisciplinary agenda for understanding and improving our human-machine ecology.

--

Please contact SWS Office for Zoom details

Simple models for optimizing driver earnings in ride-sharing platforms

Evimaria Terzi Boston University
04 Nov 2020, 4:00 pm - 05 Nov 2020, 1:00 am
Virtual talk
SWS Distinguished Lecture Series
On-demand ride-hailing platforms like Uber and Lyft are helping reshape urban transportation, by enabling car owners to become drivers for hire with minimal overhead. Such platforms are a multi-sided market and offer a rich space for studies with socio-economic implications. In this talk I am going to address two questions:

1. In the absence of coordination, what is the best course of action for a self-interested driver that wants to optimize his earnings?

2. …
On-demand ride-hailing platforms like Uber and Lyft are helping reshape urban transportation, by enabling car owners to become drivers for hire with minimal overhead. Such platforms are a multi-sided market and offer a rich space for studies with socio-economic implications. In this talk I am going to address two questions:

1. In the absence of coordination, what is the best course of action for a self-interested driver that wants to optimize his earnings?

2. In the presence of coordination, is it possible to maximize social welfare objectives in an environment where the objectives of the participants (drivers, customers and the platform) are (often) misaligned?

We will discuss the computational problems behind these problems and describe simple algorithmic solutions that work extremely well in practice. We will demonstrate the practical strength of our approaches with well-designed experiments on novel datasets we collected from such platforms. --- Please contact Office for the Zoom details.
Read more

Ensuring Compliance with Data Privacy and Usage Policies in Online Services

Aastha Mehta Max Planck Institute for Software Systems
03 Nov 2020, 2:30 pm - 3:30 pm
Saarbrücken building E1 5, room 029
SWS Student Defense Talks - Thesis Defense
Online services collect and process a variety of sensitive personal data that is subject to complex privacy and usage policies. Complying with the policies is critical and often legally binding for service providers, but it is challenging as applications are prone to many disclosure threats. In this thesis, I present two compliance systems, Qapla and Pacer, that ensure efficient policy compliance in the face of direct and side-channel disclosures, respectively.

Qapla prevents direct disclosures in database-backed applications (e.g., …
Online services collect and process a variety of sensitive personal data that is subject to complex privacy and usage policies. Complying with the policies is critical and often legally binding for service providers, but it is challenging as applications are prone to many disclosure threats. In this thesis, I present two compliance systems, Qapla and Pacer, that ensure efficient policy compliance in the face of direct and side-channel disclosures, respectively.

Qapla prevents direct disclosures in database-backed applications (e.g., personnel management systems), which are subject to complex access control, data linking, and aggregation policies. Conventional methods inline policy checks with application code. Qapla instead specifies policies directly on the database and enforces them in a database adapter, thus separating compliance from the application code.

Pacer prevents network side-channel leaks in cloud applications. A tenant’s secrets may leak via its network traffic shape, which can be observed at shared network links (e.g., network cards, switches). Pacer implements a cloaked tunnel abstraction, which hides secret-dependent variation in tenant’s traffic shape, but allows variations based on non-secret information, enabling secure and efficient use of network resources in the cloud.

Both systems require modest development efforts, and incur moderate performance overheads, thus demonstrating their usability.
Read more

Diagnosing the data ecosystem

Katrina Ligett Hebrew University of Jerusalem
02 Nov 2020, 9:00 am - 10:00 am
Virtual talk
SWS Colloquium
In this talk, we'll look together at some of the problems of today's data ecosystem, including issues of privacy, control, fairness, surveillance, and manipulation. We'll explore the ways in which various recent technologies can provide (partial) solutions to some of these woes, and we'll confront some of the limitations of technology. The discussion will expose a rich array of interesting research problems that bridge between computer science, law, economics, and beyond. -- Please contact Office for the Zoom details.
In this talk, we'll look together at some of the problems of today's data ecosystem, including issues of privacy, control, fairness, surveillance, and manipulation. We'll explore the ways in which various recent technologies can provide (partial) solutions to some of these woes, and we'll confront some of the limitations of technology. The discussion will expose a rich array of interesting research problems that bridge between computer science, law, economics, and beyond. -- Please contact Office for the Zoom details.

Towards Ultra-Reliable CPS: Reliability Analysis of Distributed Real-Time Systems

Arpan Gujarati Max Planck Institute for Software Systems
22 Oct 2020, 3:30 pm - 4:30 pm
Kaiserslautern building E1 5, room 111
SWS Student Defense Talks - Thesis Defense
In the avionics domain, "ultra-reliability" refers to the practice of ensuring quantifiably negligible residual failure rates in the presence of transient and permanent hardware faults. If autonomous Cyber-Physical Systems (CPS) in other domains, e.g., autonomous vehicles, drones, and industrial automation systems, are to permeate our everyday life in the not so distant future, then they also need to become ultra-reliable. However, the rigorous reliability engineering and analysis practices used in the avionics domain are expensive and time consuming, …
In the avionics domain, "ultra-reliability" refers to the practice of ensuring quantifiably negligible residual failure rates in the presence of transient and permanent hardware faults. If autonomous Cyber-Physical Systems (CPS) in other domains, e.g., autonomous vehicles, drones, and industrial automation systems, are to permeate our everyday life in the not so distant future, then they also need to become ultra-reliable. However, the rigorous reliability engineering and analysis practices used in the avionics domain are expensive and time consuming, and cannot be transferred to most other CPS domains. The increasing adoption of faster and cheaper, but less reliable, Commercial Off-The-Shelf (COTS) hardware is also an impediment in this regard.

Motivated by the goal of ultra-reliable CPS, this dissertation shows how to soundly analyze the reliability of COTS-based implementations of actively replicated Networked Control Systems (NCSs)—which are key building blocks of modern CPS—in the presence of transient hardware faults. When an NCS is deployed over field buses such as the Controller Area Network (CAN), transient faults are known to cause host crashes, network retransmissions, and incorrect computations. In addition, when an NCS is deployed over point-to-point networks such as Ethernet, even Byzantine errors (i.e., inconsistent broadcast transmissions) are possible. The analyses proposed in this dissertation account for NCS failures due to each of these error categories, and consider NCS failures in both time and value domains. The analyses are also provably free of reliability anomalies. Such anomalies are problematic because they can result in unsound failure rate estimates, which might lead us to believe that a system is safer than it actually is.

Specifically, this dissertation makes four main contributions. (1) To reduce the failure rate of NCSs in the presence of Byzantine errors, we present a hard real-time design of a Byzantine Fault Tolerance (BFT) protocol for Ethernet-based systems. (2) We then propose a quantitative reliability analysis of the presented design in the presence of transient faults. (3) Next, we propose a similar analysis to upper-bound the failure probability of an actively replicated CAN-based NCS. (4) Finally, to upper-bound the long-term failure rate of the NCS more accurately, we propose analyses that take into account the temporal robustness properties of an NCS expressed as weakly-hard constraints.

By design, our analyses can be applied in the context of full-system analyses. For instance, to certify a system consisting of multiple actively replicated NCSs deployed over a BFT atomic broadcast layer, the upper bounds on the failure rates of each NCS and the atomic broadcast layer can be composed using the sum-of-failure-rates model.
Read more