Events 2012

Geographic Dissection of the Twitter Network

Juhi Kulshrestha Max Planck Institute for Software Systems
07 Dec 2012, 4:00 pm - 5:00 pm
Saarbrücken building E1 5, room 005
SWS Student Defense Talks - Qualifying Exam
Geography plays an important role in shaping societal interactions in the offline world. However, as more and more social interactions occur online via social networking sites like Twitter and Facebook, users can interact with others unconstrained by their geolocations, raising the question: does offline geography still matter in online social networks? In this paper, we attempt to address this question by dissecting the Twitter social network based on users’ geolocations and investigating how users’ geolocation impacts their participation in Twitter, ...
Geography plays an important role in shaping societal interactions in the offline world. However, as more and more social interactions occur online via social networking sites like Twitter and Facebook, users can interact with others unconstrained by their geolocations, raising the question: does offline geography still matter in online social networks? In this paper, we attempt to address this question by dissecting the Twitter social network based on users’ geolocations and investigating how users’ geolocation impacts their participation in Twitter, including their connections to others and the information they exchange with them. Our in-depth analysis reveals that geography continues to have a significant impact on user interactions in the Twitter social network. The influence of geography could be potentially explained by the shared national, linguistic, and cultural backgrounds of users from the same geographic neighborhood.
Read more

Reasoning as a First-class Operating System Service

Timothy Roscoe ETH Zürich
07 Dec 2012, 11:00 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Distinguished Lecture Series
In this talk, I'll argue for sophisticated automated reasoning capabilities as a first-class OS service. With such a service, one can delegate many OS policy decisions and calculations to a component which is highly flexible, expressive, and dynamic, providing considerable advantages of hard-coding such functionality in C or scripts. Modern operating systems face several engineering challenges: hardware is increasingly complex, increasingly diverse, and evolving rapidly. This, combined with parallel workloads having complex performance interactions with hardware make it hard to build a simple OS kernel which delivers good performance for a variety of platforms and workloads. ...
In this talk, I'll argue for sophisticated automated reasoning capabilities as a first-class OS service. With such a service, one can delegate many OS policy decisions and calculations to a component which is highly flexible, expressive, and dynamic, providing considerable advantages of hard-coding such functionality in C or scripts. Modern operating systems face several engineering challenges: hardware is increasingly complex, increasingly diverse, and evolving rapidly. This, combined with parallel workloads having complex performance interactions with hardware make it hard to build a simple OS kernel which delivers good performance for a variety of platforms and workloads.

As a first step, we decided to tackle this head-on by building a reasoning engine as a first class service (the "System Knowledge Base") in the Barrelfish OS, borrowing ideas from such fields as knowledge representation, constraint satisfaction, logic programming, and optimization. Doing so was not without problems, but we found it highly convenient in a number of widely different application areas - for example PCI programming, process coordination, spatial scheduling, and message routing. I'll discuss several of these, and how the structure of the OS as a whole changes when a facility like the SKB is available Finally, I'll talk about some future directions, in particularly with embedded devices such as SoCs.
Read more

Expositor: Scriptable Time-Travel Debugging with First Class Traces

Prof. Michael Hicks University of Maryland
30 Nov 2012, 10:30 am - 12:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Distinguished Lecture Series
We present Expositor, a new debugging environment that combines scripting and time-travel debugging to allow developers to automate complex debugging tasks. The fundamental abstraction provided by Expositor is the execution trace, which is a time-indexed sequence of program state snapshots. Developers can manipulate traces as if they were simple lists with operations such as map and filter. Under the hood, Expositor efficiently implements traces as lazy, sparse interval trees, whose contents are materialized on demand. Expositor also provides a novel data structure, ...
We present Expositor, a new debugging environment that combines scripting and time-travel debugging to allow developers to automate complex debugging tasks. The fundamental abstraction provided by Expositor is the execution trace, which is a time-indexed sequence of program state snapshots. Developers can manipulate traces as if they were simple lists with operations such as map and filter. Under the hood, Expositor efficiently implements traces as lazy, sparse interval trees, whose contents are materialized on demand. Expositor also provides a novel data structure, the edit hash array mapped trie, which is a lazy implementation of sets, maps, multisets, and multimaps that enables developers to maximize the efficiency of their debugging scripts. We have used Expositor to debug a stack overflow and to unravel a subtle data race in Firefox. We believe that Expositor represents an important step forward in improving the technology for diagnosing complex, hard-to-understand bugs. This is joint work with Yit Phang Khoo and Jeff Foster, both at Maryland
Read more

System Designs for Securing Data and Computations against Administration Threats

Nuno Santos Max Planck Institute for Software Systems
19 Nov 2012, 6:00 pm - 7:00 pm
Saarbrücken building E1 5, room 005
SWS Student Defense Talks - Thesis Proposal
The modern computing platforms where users store and process security-sensitive data offer few or no defenses against administration threats. In server platforms such as corporate IT and cloud provider infrastructures, a curious or malicious system administrator could easily leak or corrupt the users’ data therein located. In mobile platforms (e.g., smartphones) attackers that manage to escalate administration privileges (e.g., after stealing the device) could get their hands on personal and enterprise data residing in the devices. ...
The modern computing platforms where users store and process security-sensitive data offer few or no defenses against administration threats. In server platforms such as corporate IT and cloud provider infrastructures, a curious or malicious system administrator could easily leak or corrupt the users’ data therein located. In mobile platforms (e.g., smartphones) attackers that manage to escalate administration privileges (e.g., after stealing the device) could get their hands on personal and enterprise data residing in the devices.

To protect user data and computations against administration threats, trusted computing systems can be an effective approach: trusted computing software shields user data and computations from the administrator, and trusted computing hardware checks the boot time integrity of the software binary to assure users that the software’s security properties hold. However, applying this technique to server and mobile platforms is challenging. Trusted computing hardware raises scalability and security concerns when applied to multi-node environments, such as cloud infrastructures. In addition, existing trusted computing software significantly limits administrators’ ability for maintaining the systems (e.g., installing packages and kernel modules), and offers application developers low level programming abstractions, which make application development cumbersome.

In this thesis, we present three systems aimed at addressing these challenges. First, we propose Excalibur, a system that provides support for building trusted cloud services by retrofitting commodity trusted computing hardware to cloud infrastructures; it exposes simple primitives that restrict access to user data by the provider’s cloud nodes based on a user-defined policy. Second, we present an OS design model, named broker security model, and a proof-of-concept OS—BrokULOS—that shows how to design a Linux-based OS that can provide information security while letting the administrator perform the management tasks required to keep the system operational. Third, we propose the Trusted Language Runtime (TLR), trusted computing software that exposes high-level abstractions for developing trusted applications for smartphones; the TLR depends on a small trusted computing base by taking advantage on ARM TrustZone technology and the .Net Microframework.
Read more

Beluga^mu: Programming proofs in context

Brigitte Pientka McGill University
14 Nov 2012, 10:00 am - 11:00 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Colloquium
Software systems should be robust, reliable, and predictable. Today, we routinely reason about the runtime behavior of software using formal systems such as type systems or logics for access control or information flow to establish safety and liveness properties.

In this talk, I will survey Beluga, a dependently typed programming and proof environment. It supports specifying formal systems in the logical framework LF and directly supports common and tricky routines dealing with variables, such as capture-avoiding substitution and renaming. ...
Software systems should be robust, reliable, and predictable. Today, we routinely reason about the runtime behavior of software using formal systems such as type systems or logics for access control or information flow to establish safety and liveness properties.

In this talk, I will survey Beluga, a dependently typed programming and proof environment. It supports specifying formal systems in the logical framework LF and directly supports common and tricky routines dealing with variables, such as capture-avoiding substitution and renaming. Moreover, Beluga allows embedding LF objects together with their context in programs and types supporting inductive and coinductive definitions. I will discuss how to write inductive and coinductive proofs about LF specifications using three different examples: type uniqueness, normalization by evaluation, and the fact that evaluating a lambda-term cannot yield a value and diverge at the same time. Taken together these examples demonstrate the elegance and conciseness of Beluga for specifying, verifying and validating safety properties.
Read more

Corybantic: Towards the Modular Composition of SDN Controllers

Jeffrey Mogul HP Labs, Palo Alto
05 Nov 2012, 2:00 pm - 3:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Distinguished Lecture Series
Software-Defined Networking (SDN) promises to enable vigorous innovation, through separation of the control plane from the data plane, and to enable novel forms of network management, through a controller that uses a global view to make globally-valid decisions. The design of SDN controllers creates novel challenges; much previous work has focussed on making them scalable, reliable, and efficient.

We argue that, to control a realistic network, we do not want one monolithic SDN controller. ...
Software-Defined Networking (SDN) promises to enable vigorous innovation, through separation of the control plane from the data plane, and to enable novel forms of network management, through a controller that uses a global view to make globally-valid decisions. The design of SDN controllers creates novel challenges; much previous work has focussed on making them scalable, reliable, and efficient.

We argue that, to control a realistic network, we do not want one monolithic SDN controller. Instead, we want to compose the effects of many controller modules managing different aspects of the network, which may be competing for resources. Each module will try to optimize one or more objectives; we address the challenge of how to coordinate between these modules to optimize an overall objective function. Our framework design, Corybantic, focusses on achieving both modular decomposition and maximizing the overall value delivered by the controller's decisions.
Read more

Reasoning with MAD distributed systems

Lorenzo Alvisi University of Texas at Austin
31 Oct 2012, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Distinguished Lecture Series
Decentralized approaches spanning multiple administrative domains (MAD) are an increasingly effective way to deploy services. Popular examples include peer-to-peer (p2p) applications, content distribution networks, and mesh routing protocols. Cooperation lies at the heart of these services. Yet, when working together is crucial, a natural question is: "What if users stop cooperating?" After all, users in a MAD system are vulnerable not only to the failure of some of the equipment in the system, or to the possibility that some users may behave maliciously, ...
Decentralized approaches spanning multiple administrative domains (MAD) are an increasingly effective way to deploy services. Popular examples include peer-to-peer (p2p) applications, content distribution networks, and mesh routing protocols. Cooperation lies at the heart of these services. Yet, when working together is crucial, a natural question is: "What if users stop cooperating?" After all, users in a MAD system are vulnerable not only to the failure of some of the equipment in the system, or to the possibility that some users may behave maliciously, but also to the possibility that users may selfishly refrain from sharing their resources as the protocol would require. In this setting, it is hard to put a bound on the number of components in the systems that deviate from their correct specification. Is it still possible under such circumstances to build systems that not only provide provable guarantees in terms of their safety and liveness properties but also yield practical performance?
Read more

Provenance Verification

Zilong Wang Max Planck Institute for Software Systems
11 Oct 2012, 2:00 pm - 3:00 pm
Kaiserslautern building G26, room 204
SWS Student Defense Talks - Qualifying Exam
We study the problem of provenance tracking in concurrent programs. The provenance verification problem is to statically decide, given a message passing program and a set of allowed provenances, whether the provenance of all messages in all possible program executions, belongs to the allowed set. We formalize the provenance verification problem and show a general decidability result for it.

Automated Malware Analysis

Dr. Christopher Kruegel University of California, Santa Barbara
01 Oct 2012, 11:00 am - 12:30 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Distinguished Lecture Series
Malicious software (malware) is an important threat and root cause of many security problems on the Internet. In this talk, I will discuss our recent efforts on malware analysis, detection, and mitigation. First, I will introduce our infrastructure to collect and analyze malicious code samples. Then, I will present techniques to improve the quality of the results produced by automated, dynamic malware analysis systems. Finally, I will discuss ways in which these results can be leveraged for the detection and mitigation of malicious code. ...
Malicious software (malware) is an important threat and root cause of many security problems on the Internet. In this talk, I will discuss our recent efforts on malware analysis, detection, and mitigation. First, I will introduce our infrastructure to collect and analyze malicious code samples. Then, I will present techniques to improve the quality of the results produced by automated, dynamic malware analysis systems. Finally, I will discuss ways in which these results can be leveraged for the detection and mitigation of malicious code.
Read more

Planet Dynamic or: How I Learned to Stop Worrying and Love Reflection

Prof. Jan Vitek Purdue University
25 Sep 2012, 3:00 pm - 4:30 pm
Kaiserslautern building G26, room 206
simultaneous videocast to Saarbrücken building E1 5, room 029
SWS Distinguished Lecture Series
A fundamental belief underlying forty years of programming languages research, aptly captured by the slogan "Well-typed programs can't go wrong", is that programs augmented with machine-checked annotations are more likely to be free of bugs. But of course, real programs do wrong and programmers are voting with their feet. Dynamic languages such as Ruby, Python, Lua, JavaScript and R are unencumbered by redundant type annotations and are increasingly popular. JavaScript, the lingua franca of the web, ...
A fundamental belief underlying forty years of programming languages research, aptly captured by the slogan "Well-typed programs can't go wrong", is that programs augmented with machine-checked annotations are more likely to be free of bugs. But of course, real programs do wrong and programmers are voting with their feet. Dynamic languages such as Ruby, Python, Lua, JavaScript and R are unencumbered by redundant type annotations and are increasingly popular. JavaScript, the lingua franca of the web, is moving to the server with the success of Node.js. R, another dynamic language, is being used in statistics, biology and finance for data analysis and visualization. Not only are these languages devoid of types, but they utterly lack any static structure that could be used for program verification. This talk will draw examples from recent results on JavaScript and R to illustrate the extent of the problem and propose some directions for research.
Read more

Towards Trustworthy Embedded Systems

Gernot Heiser NICTA, Kensington, Australia
24 Sep 2012, 1:00 pm - 2:30 pm
Kaiserslautern building G26, room 206
simultaneous videocast to Saarbrücken building E1 5, room 029
SWS Distinguished Lecture Series
Embedded systems are increasingly used in circumstances where people's lives or valuable assets are at stake, hence they should be trustworthy - safe, secure, reliable. True trustworthiness can only be achieved through mathematical proof of the relevant properties. Yet, real-world software systems are far too complex to make their formal verification tractable in the foreseeable future. The Trustworthy Systems project at NICTA has formally proved the functional correctness as well as other security-relevant properties of the seL4 microkernel. ...
Embedded systems are increasingly used in circumstances where people's lives or valuable assets are at stake, hence they should be trustworthy - safe, secure, reliable. True trustworthiness can only be achieved through mathematical proof of the relevant properties. Yet, real-world software systems are far too complex to make their formal verification tractable in the foreseeable future. The Trustworthy Systems project at NICTA has formally proved the functional correctness as well as other security-relevant properties of the seL4 microkernel. This talk will provide an overview of the principles underlying seL4, and the approach taken in its design, implementation and formal verification. It will also discuss on-going activities and our strategy for achieving the ultimate goal of system-wide security guarantees.
Read more

Opportunistic Mobile Social Networks at Work

Anna-Kaisa Pietilainen Technicolor Paris Research Laboratory
06 Jul 2012, 1:30 pm - 2:30 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Colloquium
Opportunistic networks exploit human mobility and consequent device-to-device ad hoc contacts to disseminate content in a "store-carry-forward" fashion. In opportunistic networks, disconnections and highly variable delays caused by human mobility are the norm. Another major challenge in opportunistic communications arises from the small form factor of mobile devices which introduces resource limitations compared to static computing systems. Lastly, human mobility and social interactions have a large impact on the structure and performance of opportunistic networks, hence, ...
Opportunistic networks exploit human mobility and consequent device-to-device ad hoc contacts to disseminate content in a "store-carry-forward" fashion. In opportunistic networks, disconnections and highly variable delays caused by human mobility are the norm. Another major challenge in opportunistic communications arises from the small form factor of mobile devices which introduces resource limitations compared to static computing systems. Lastly, human mobility and social interactions have a large impact on the structure and performance of opportunistic networks, hence, understanding these phenomena is crucial for the design of efficient algorithms and applications.

In this work, we take an experimental approach to better understand opportunistic mobile social networks. We design and implement of MobiClique, a communication middleware for opportunistic mobile social networking. MobiClique takes advantage of user mobility and social relationships to forward messages in an opportunistic manner. We perform a large-scale MobiClique experiment with 76 people, where we collect social network information (i.e. their Facebook profiles), and ad hoc contact and communication traces. We use the collected data with three other data sets to analyse in detail the time-varying structure and epidemic content dissemination in opportunistic networks. Most of the related works have focused on the pairwise contact history among users in conference or campus environments. We claim that given the density of these networks, this approach leads to a biased understanding of the content dissemination process. We design a methodology to break the contact traces down into "temporal communities", i.e., groups of people who meet periodically during an experiment. We show that these communities correlate with people's social communities. As in previous works, we observe that efficient content dissemination is mostly due to high contact rate users. However, we show that high contact rate users that are more frequently involved in temporal communities contribute less to the dissemination process, leading us to conjecture that social communities tend to limit the efficiency of content dissemination in opportunistic mobile social networks.
Read more

Provably-Secure Cryptographic Protocols: from Practice to Theory to Practice

Dario Fiore New York University
05 Jul 2012, 11:00 am - 12:00 pm
Saarbrücken building E1 5, room 029
SWS Colloquium
Digital signatures can be seen as the digital equivalent of handwritten signatures, and are considered one of the most important cryptographic primitives. At a high level, they allow a user Alice to authenticate a digital document by generating a piece of information, that is the signature, using a secret key which is known only by her. Any other user who gets a matching public verification key can check the validity of such signature and thus be convinced that it was generated by Alice. ...
Digital signatures can be seen as the digital equivalent of handwritten signatures, and are considered one of the most important cryptographic primitives. At a high level, they allow a user Alice to authenticate a digital document by generating a piece of information, that is the signature, using a secret key which is known only by her. Any other user who gets a matching public verification key can check the validity of such signature and thus be convinced that it was generated by Alice. Digital signatures are required to satisfy the most natural security property one could expect: no one, except who knows the secret key, should be able to generate valid signatures. In the quest of mimicking in the digital world what we are used to do in the real world, an interesting question naturally arises: can Alice delegate the signing process (on a restricted set of messages) to third parties without having to reveal to them her secret key? A positive answer to this question has been recently given by Boneh et al. (PKC 2009) by means of homomorphic signatures.

In the first part of my talk, I will present the notion of homomorphic signatures: I will describe important applications which motivate the study of this primitive, and I will survey recent results of mine that propose efficient constructions.

In the second part of the talk I will move the focus to a related, but more general and intriguing question: can we sign computation? Are there means to certify that a program has been run correctly and/or on the correct inputs? These and similar questions are nowadays arising in the context of cloud computing applications in which users want to delegate computation and memory to third parties that are called cloud providers. I will describe relevant security issues emerging from these applications, and will discuss how cryptography can help to solve such problems.

During my presentation I will also mention the usual approaches underlying the process of designing cryptographic protocols, with a particular emphasis on how theory and practice can interact in a significant way.

The talk is mainly based on the following works joint with Dario Catalano, Rosario Gennaro and Bogdan Warinschi. - D. Catalano, D. Fiore and B. Warinschi. Adaptive Pseudo-Free Groups and Applications. EUROCRYPT 2011 - D. Catalano, D. Fiore and B. Warinschi. Efficient Network Coding Signatures in the Standard Model. PKC 2012 - D. Fiore and R. Gennaro. Publicly Verifiable Delegation of Large Polynomials and Matrix Computations, with Applications. Pre-print (May 2012): http://eprint.iacr.org/2012/281
Read more

Supporting Nested Locking in Multiprocessor Real-Time Systems

Bryan Ward University of North Carolina, Chapel Hill
03 Jul 2012, 11:00 am - 12:30 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Colloquium
This paper presents the first real-time multiprocessor locking protocol that supports fine-grained nested resource requests. This locking protocol relies on a novel technique for ordering the satisfaction of resource requests to ensure a bounded duration of priority inversions for nested requests. This technique can be applied on partitioned, clustered, and globally scheduled systems in which waiting is realized by either spinning or suspending. Furthermore, this technique can be used to construct fine-grained nested locking protocols that are efficient under spin-based, ...
This paper presents the first real-time multiprocessor locking protocol that supports fine-grained nested resource requests. This locking protocol relies on a novel technique for ordering the satisfaction of resource requests to ensure a bounded duration of priority inversions for nested requests. This technique can be applied on partitioned, clustered, and globally scheduled systems in which waiting is realized by either spinning or suspending. Furthermore, this technique can be used to construct fine-grained nested locking protocols that are efficient under spin-based, suspension-oblivious or suspension-aware analysis of priority inversions. Locking protocols built upon this technique perform no worse than coarse-grained locking mechanisms, while allowing for increased parallelism in the average case (and, depending upon the task set, better worst-case performance).
Read more

Non-tracking Web Analytics

Istemi Ekin Akkus Max Planck Institute for Software Systems
22 Jun 2012, 2:00 pm - 3:00 pm
Kaiserslautern building G26, room 204
SWS Student Defense Talks - Qualifying Exam
Today, websites commonly use third party web analytics services to obtain aggregate information about users that visit their sites. This information includes demographics and visits to other sites as well as user behavior within their own sites. Unfortunately, to obtain this aggregate information, web analytics services track individual user browsing behavior across the web. This violation of user privacy has been strongly criticized, resulting in tools that block such tracking as well as anti-tracking legislation and standards such as Do-Not-Track. ...
Today, websites commonly use third party web analytics services to obtain aggregate information about users that visit their sites. This information includes demographics and visits to other sites as well as user behavior within their own sites. Unfortunately, to obtain this aggregate information, web analytics services track individual user browsing behavior across the web. This violation of user privacy has been strongly criticized, resulting in tools that block such tracking as well as anti-tracking legislation and standards such as Do-Not-Track. These efforts, while improving user privacy, degrade the quality of web analytics. This paper presents the first design of a system that provides web analytics without tracking. The system gives users differential privacy guarantees, can provide better quality analytics than current services, requires no new organizational players, and is practical to deploy. This paper describes and analyzes the design, gives performance benchmarks, and presents our implementation and deployment across several hundred users.
Read more

Provenance for Database Transformations

Val Tannen University of Pennsylvania and EPFL
21 Jun 2012, 4:00 pm - 5:00 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Distinguished Lecture Series
Database transformations (queries, views, mappings) take apart, filter,and recombine source data in order to populate warehouses, materialize views,and provide inputs to analysis tools. As they do so, applications often need to track the relationship between parts and pieces of the sources and parts and pieces of the transformations' output. This relationship is what we call database provenance.

This talk presents an approach to database provenance that relies on two observations. First, provenance is a kind of annotation, ...
Database transformations (queries, views, mappings) take apart, filter,and recombine source data in order to populate warehouses, materialize views,and provide inputs to analysis tools. As they do so, applications often need to track the relationship between parts and pieces of the sources and parts and pieces of the transformations' output. This relationship is what we call database provenance.

This talk presents an approach to database provenance that relies on two observations. First, provenance is a kind of annotation, and we can develop a general approach to annotation propagation that also covers other applications, for example to uncertainty and access control. In fact, provenance turns out to be the most general kind of such annotation,in a precise and practically useful sense. Second, the propagation of annotation through a broad class of transformations relies on just two operations: one when annotations are jointly used and one when they are used alternatively.This leads to annotations forming a specific algebraic structure, a commutative semiring.

The semiring approach works for annotating tuples, field values and attributes in standard relations, in nested relations (complex values), and for annotating nodes in(unordered) XML. It works for transformations expressed in the positive fragment of relational algebra, nested relational calculus, unordered XQuery, as well as for Datalog, GLAV schema mappings, and tgd constraints. Finally, when properly extended to semimodules it works for queries with aggregates. Specific semirings correspond to earlier approaches to provenance, while others correspond to forms of uncertainty, trust, cost, and access control.

This is joint work with Y. Amsterdamer, D. Deutch, J.N. Foster, T.J. Green, Z. Ives, and G. Karvounarakis, done in part within the frameworks of the Orchestra and pPOD projects.
Read more

How Mobile Disrupts and Opens Up Social as We Know It

Monica Lam Stanford University
01 Jun 2012, 10:30 am - 11:30 am
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Distinguished Lecture Series
Being a personal device that is constantly online, the mobile device will revolutionize social computing. We will be able to share multimedia and interact freely with any of our different circles of friends, colleagues, and families, without being confined to closed proprietary social networks and without having our communication intermediated by third-parties. We believe this openness will lead to an explosion of social apps in new categories including health, finance, and corporate.

We have started the revolution with the first of such a system called Musubi. ...
Being a personal device that is constantly online, the mobile device will revolutionize social computing. We will be able to share multimedia and interact freely with any of our different circles of friends, colleagues, and families, without being confined to closed proprietary social networks and without having our communication intermediated by third-parties. We believe this openness will lead to an explosion of social apps in new categories including health, finance, and corporate.

We have started the revolution with the first of such a system called Musubi. Musubi is a real-time, multimedia group chat app that allows users to control all their data. It is also an open app platform that lets apps be easily created and spread virally among friends. Musubi is built on top of an egocentric social platform where identity-based cryptography is used to allow individuals communicate securely with each other without the friction of key exchange. It provides an innovative identity firewall that supports social apps without users giving up friends' data to the app creators.

Musubi Beta and a number of social apps are already available in the Google Play Store. We invite you to give us feedback on the apps and to join us in creating an open mobile social internet.
Read more

Mr. Ram Sewak Sharma (Unique Identification Authority of India (UIDAI)) talks about "Design for UID Numbers to all Indian residents"

30 May 2012, 11:30 am - 12:15 pm
Saarbrücken building E1 5, room 029
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Colloquium
Unique Identification Authority of India (UIDAI) has been setup by the Govt. of India with a mandate to issue a unique identification number to all the residents in the country. UIDAI proposes to create a platform to first collect the identity details and then to perform authentication that can be used by several government and commercial services providers. A key requirement of the UID system is to minimize/eliminate duplicate identity in order to improve the efficacy of the service delivery. ...
Unique Identification Authority of India (UIDAI) has been setup by the Govt. of India with a mandate to issue a unique identification number to all the residents in the country. UIDAI proposes to create a platform to first collect the identity details and then to perform authentication that can be used by several government and commercial services providers. A key requirement of the UID system is to minimize/eliminate duplicate identity in order to improve the efficacy of the service delivery. UIDAI has selected biometrics feature set as the primary method to check for duplicate identity. In order to ensure that an individual is uniquely identified in an easy and cost-effective manner, it is necessary to ensure that the captured biometric information is capable of carrying out the de-duplication at the time of collection of information. For government and commercial providers to authenticate the identity at the time of service delivery, it is necessary that the biometric information capture and transmission are standardized across all the partners and users of the UID system.

UID provides, online, cost-effective, ubiquitous authentication services across the country and establishes identity. Authentication is transactional and to be done at the time of availing a benefit.

The objective of the UIDAI is to provide a robust, reusable ID to those who do not have any ID proof, cleaning up existing databases of multiple and fake entries through Uniqueness and to improve targeting and delivery of services. It will also reduce the cost of delivery of services.
Read more

A typed package language for Haskell

Scott Kilpatrick Max Planck Institute for Software Systems
23 May 2012, 4:00 pm - 5:00 pm
Saarbrücken building E1 5, room 029
SWS Student Defense Talks - Qualifying Exam
Package management systems have traditionally served as the media through which (open source) software is distributed and installed. Each package consists of a collection of program source files, organized through name- and version-based dependencies on one another.  Though ubiquitous in practice, these systems have largely evaded formal treatment as typed programming languages, unlike the rich work on "typed module systems" like those of the ML family of languages.

In this talk I will present a design for a typed language of packages for the Haskell programming language, ...
Package management systems have traditionally served as the media through which (open source) software is distributed and installed. Each package consists of a collection of program source files, organized through name- and version-based dependencies on one another.  Though ubiquitous in practice, these systems have largely evaded formal treatment as typed programming languages, unlike the rich work on "typed module systems" like those of the ML family of languages.

In this talk I will present a design for a typed language of packages for the Haskell programming language, based on the state-of-the-art module system design of "mixins."  Through this language design--in particular, through the addition of typed *interfaces*--packages may be correctly abstracted over their dependencies, which directly results in greater flexibility and reuse, as well as the ability for package authors to mechanically verify the absence of installation-time errors.  A clean elaboration from package expressions to bundles of ordinary source files shows how this language adds functionality while maintaining compatibility with existing compilers and tools.  Although this work takes Haskell as its underlying source language, its design could readily be generalized to define packages for other underlying languages.
Read more

Protecting Data Integrity with Storage Leases

Anjo Vahldiek Max Planck Institute for Software Systems
23 May 2012, 10:00 am - 11:00 am
Saarbrücken building E1 5, room 005
SWS Student Defense Talks - Qualifying Exam
We present a storage primitive called a storage lease. Data stored under a lease cannot be written for a pre-determined period. During the lease period, online data is protected from corruption due to security breaches, software errors, or accidental data deletion. Storage leases fill an important gap in the spectrum of data protection options because they combine strong integrity for online data with the ability to eventually reclaim storage. We define the storage lease primitive, show how it can be implemented in storage device firmware, ...
We present a storage primitive called a storage lease. Data stored under a lease cannot be written for a pre-determined period. During the lease period, online data is protected from corruption due to security breaches, software errors, or accidental data deletion. Storage leases fill an important gap in the spectrum of data protection options because they combine strong integrity for online data with the ability to eventually reclaim storage. We define the storage lease primitive, show how it can be implemented in storage device firmware, and discuss its applications. An experimental evaluation indicates that storage leases have a modest performance cost for most workloads on magnetic disks. Using a small amount of flash memory, this overhead can be reduced to near zero.
Read more

Scalable Machine Learning for the User

Alexander J. Smola Yahoo! Research & UC Berkeley & ANU
14 May 2012, 11:00 am - 12:30 pm
Saarbrücken building E1 4, room 019
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Distinguished Lecture Series
Scalable content personalization and profiling is a key tool for the internet. In this talk I will illustrate based on three problems how this can be achieved. More specifically I will show how hashing can be used to deal with compactly representing enormous amounts of parameters, how distributed latent variable inference can be used for user profiling, and how session modeling provides an attractive alternative to ranking.

Credit Networks: Liquidity and Formation

Pranav Dandekar Stanford University
23 Apr 2012, 11:00 am - 12:00 pm
Saarbrücken building E1 5, room 5th floor
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Colloquium
Credit networks represent a way of modeling trust between entities in a network. Nodes in the network print their own currency and trust each other for a certain amount of each other's currency. This allows the network to serve as a decentralized payment infrastructure---arbitrary payments can be routed through the network by passing IOUs along a chain of trusting nodes in their respective currencies---and obviates the need for a common currency. Thus, credit networks are a decentralized approach based on trust to enable interactions between untrusting individuals in internet based networks and markets. ...
Credit networks represent a way of modeling trust between entities in a network. Nodes in the network print their own currency and trust each other for a certain amount of each other's currency. This allows the network to serve as a decentralized payment infrastructure---arbitrary payments can be routed through the network by passing IOUs along a chain of trusting nodes in their respective currencies---and obviates the need for a common currency. Thus, credit networks are a decentralized approach based on trust to enable interactions between untrusting individuals in internet based networks and markets.

We will first analyze the liquidity, i.e. the ability to route transactions, in credit networks in terms of the long term failure probability of transactions for various network topologies and credit values. We will show that under symmetric transaction rates, the transaction failure probability in a number of credit network topologies is comparable to that in equivalent centralized currency systems; thus we do not lose much liquidity in return for their robustness and decentralized properties. This is based on joint work with Ashish Goel, Ramesh Govindan and Ian Post which appeared in EC'11.

Next we will analyze the formation of credit networks when agents strategically decide how much credit to extend each other under different models of risk. When each agent trusts a fixed set of other agents, and transacts directly only with those it trusts, the formation game is a potential game and all Nash equilibria are social optima. Moreover, the Nash equilibria of this game are equivalent in a very strong sense: the sequences of transactions that can be supported from each equilibrium credit network are identical. However, when we allow transactions over longer paths, the game may not admit a Nash equilibrium, and the price of anarchy is unbounded. When agents have a shared belief about the trustworthiness of each agent, the networks formed in equilibrium have a star-like structure. Though the price of anarchy is unbounded, myopic best response quickly converges to a social optimum. This is based on joint work with Ashish Goel, Michael Wellman and Bryce Wiedenbeck to be presented at WWW'12.
Read more

Energy Debugging in Smartphones

Charlie Hu Purdue University
16 Apr 2012, 11:00 am - 12:00 pm
Saarbrücken building E1 5, room 5th floor
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Colloquium
Despite the incredible market penetration of smartphones and exponential growth of the app market, utility of smartphones has been and will remain severely limited by the battery life. As such, energy has increasingly become the scarcest resource on smartphones that critically affects user experience. In this talk, I will start with a first study that characterizes smartphone energy bugs, or ebugs, broadly defined as errors in the system (apps, OS, hardware, firmware, or external conditions) that result in unexpected smartphone battery drainage and leads to significant user frustrations. ...
Despite the incredible market penetration of smartphones and exponential growth of the app market, utility of smartphones has been and will remain severely limited by the battery life. As such, energy has increasingly become the scarcest resource on smartphones that critically affects user experience. In this talk, I will start with a first study that characterizes smartphone energy bugs, or ebugs, broadly defined as errors in the system (apps, OS, hardware, firmware, or external conditions) that result in unexpected smartphone battery drainage and leads to significant user frustrations.

As a first step towards taming ebugs, we built the first fine-grained energy profiler, eprof, that performs energy accounting and hence answers the very question "where was the energy spent in the app?" at the per-routine, per-thread, and per-process granularity. Building eprof in turn requires a fine-grained, online power model which we have developed that captures the unique asynchronous power behavior of modern smartphones. Using eprof, we dissected the energy drain of some of the most popular apps in Android Market and discovered ebugs in popular apps like Facebook.

While essential, eprof only provides a semi-automatic tool for energy debugging. The "holy grail" in energy debugging in smartphones is to develop fully automatic debugging techniques and tools, which can draw synergies from many areas of computer science including OS, PL, compilers, machine learning, HCI, etc. I will present the first automatic ebug detection technique based on static compiler analysis for detecting "no-sleep" energy bugs, the most notorious category of energy bugs found in smartphone apps.
Read more

SCION: Scalability, Control, and Isolation On Next-Generation Networks

Adrian Perrig Carnegie Mellon University
28 Mar 2012, 11:00 am - 12:00 pm
Saarbrücken building E1 5, room 5th floor
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Distinguished Lecture Series
We present the first Internet architecture designed to provide route control, failure isolation, and explicit trust information for end-to-end communications. SCION separates ASes into groups of independent routing sub-planes, called trust domains, which then interconnect to form complete routes. Trust domains provide natural isolation of routing failures and human misconfiguration, give endpoints strong control for both inbound and outbound traffic, provide meaningful and enforceable trust, and enable scalable routing updates with high path freshness. As a result, ...
We present the first Internet architecture designed to provide route control, failure isolation, and explicit trust information for end-to-end communications. SCION separates ASes into groups of independent routing sub-planes, called trust domains, which then interconnect to form complete routes. Trust domains provide natural isolation of routing failures and human misconfiguration, give endpoints strong control for both inbound and outbound traffic, provide meaningful and enforceable trust, and enable scalable routing updates with high path freshness. As a result, our architecture provides strong resilience and security properties as an intrinsic consequence of good design principles, avoiding piecemeal add-on protocols as security patches. Meanwhile, SCION only assumes that a few top-tier ISPs in the trust domain are trusted for providing reliable end-to-end communications, thus achieving a small Trusted Computing Base. Both our security analysis and evaluation results show that SCION naturally prevents numerous attacks and provides a high level of resilience, scalability, control, and isolation.
Read more

Forecast: Cloudy with a Chance of Consistency

Tim Kraska UC Berkeley
26 Mar 2012, 10:30 am - 11:30 am
Kaiserslautern building G26, room 206
simultaneous videocast to Saarbrücken building E1 5, room Wartburg, 5th floor
SWS Colloquium
Cloud computing promises virtually unlimited scalability and high availability at low cost. However, it is commonly believed that a system's consistency must be relaxed in order to achieve these properties. We evaluated existing commercial cloud offerings for transactional workloads and found that consistency is indeed expensive and limits scalability in these systems.  Because of these costs, many systems designers have chosen to provide only relaxed consistency guarantees, if any, making such systems inappropriate for many mission-critical applications. ...
Cloud computing promises virtually unlimited scalability and high availability at low cost. However, it is commonly believed that a system's consistency must be relaxed in order to achieve these properties. We evaluated existing commercial cloud offerings for transactional workloads and found that consistency is indeed expensive and limits scalability in these systems.  Because of these costs, many systems designers have chosen to provide only relaxed consistency guarantees, if any, making such systems inappropriate for many mission-critical applications. This dichotomy is based on unrealistically pessimistic assumptions about Big Data environments. First, it assumes that consistency is an all or nothing decision that must be applied uniformly to all data in a system. Secondly, even in situations where strong consistency is required, previous transaction commit protocols were based on worst-case assumptions regarding the likelihood of conflicts.  In this talk, I will describe two techniques that build on a more nuanced view of consistency requirements and the costs of maintaining them.

I will first describe Consistency Rationing, which builds on inventory holding models used in Operations Research to help classify and manage data based on their consistency requirements. Consistency Rationing exploits the fact that for some data the cost of maintaining consistency outweighs the benefit obtained by avoiding inconsistencies. In the second part of the talk, I will present a new optimistic commit protocol for the wide-area network. For a long time, synchronized wide-area replication was considered to be infeasible with strong consistency. With MDCC, I will show how we can achieve strong consistency with similar response-time guarantees as eventual consistency in the normal operational case. This work was done as part of a larger project around Big Data management. At the end of the talk, I will provide an overview of some of my other projects and give an outline for future work.
Read more

Turning Sequential into Concurrent

Petr Kuznetsov TU Berlin/Deutsche Telekom Laboratories
22 Mar 2012, 3:00 pm - 4:00 pm
Saarbrücken building E1 5, room 5th floor
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Colloquium
It seems to be generally accepted that designing correct and efficient concurrent software is a sophisticated task that can only be held by experts. A crucial challenge then is to convert sequential code produced by a ``mainstream'' programmer into concurrent one. Various synchronization techniques may be used for this, e.g., locks or transactional memory, but what does it mean for the resulting concurrent implementation to be correct? And which synchronization primitives provide more efficiency at the end? ...
It seems to be generally accepted that designing correct and efficient concurrent software is a sophisticated task that can only be held by experts. A crucial challenge then is to convert sequential code produced by a ``mainstream'' programmer into concurrent one. Various synchronization techniques may be used for this, e.g., locks or transactional memory, but what does it mean for the resulting concurrent implementation to be correct? And which synchronization primitives provide more efficiency at the end?

We introduce a correctness criterion for a transformation that enables the use of a sequential data structure in a concurrent system. We then evaluate the performance of resulting concurrent implementations in terms of the sets of concurrent schedules (interleavings of steps of the sequential code) they accept. Intuitively, this captures the amount of concurrency that a given implementation can stand. This allows us to analyze relative power of seemingly different synchronization techniques, such as various forms of locking and transactional memory
Read more

Language as Influence(d)

Cristian Danescu-Niculescu-Mizil Cornell University
19 Mar 2012, 10:30 am - 12:00 pm
Kaiserslautern building G26, room 206
simultaneous videocast to Saarbrücken building E1 5, room 5th Floor
SWS Colloquium
What effect does language have on people, and what effect do people have on language? The answers to these questions can help shape the future of social-media systems by bringing a new understanding of communication and collaboration between users.

I will describe two of my efforts to address these fundamental problems computationally, exploiting very large-scale textual and social data. The first project uncovers previously unexamined contextual biases that people have when determining which opinions to focus on, ...
What effect does language have on people, and what effect do people have on language? The answers to these questions can help shape the future of social-media systems by bringing a new understanding of communication and collaboration between users.

I will describe two of my efforts to address these fundamental problems computationally, exploiting very large-scale textual and social data. The first project uncovers previously unexamined contextual biases that people have when determining which opinions to focus on, using Amazon.com helpfulness votes on reviews as a case study to evaluate competing theories from sociology and social psychology. The second project leverages insights from psycho- and socio-linguistics and embeds them into a novel computational framework in order to provide a new understanding of how key aspects of social relations between individuals are embedded in (and can be inferred from) their conversational behavior. In particular, I will discuss how power differentials between interlocutors are subtly revealed by how much one individual immediately echoes the linguistic style of the person they are responding to.

This talk includes joint work with Susan Dumais, Michael Gamon, Jon Kleinberg, Gueorgi Kossinets, Lillian Lee and Bo Pang.
Read more

Information Discovery in Large Complex Datasets

Julia Stoyanovich University of Pennsylvania
15 Mar 2012, 10:30 am - 11:30 am
Kaiserslautern building G26, room 206
simultaneous videocast to Saarbrücken building E1 5, room 5th floor
SWS Colloquium
The focus of my research is on enabling novel kinds of interaction between the user and the information in a variety of digital environments, ranging from social content sites, to digital libraries, to the Web. In this talk, I will give an overview of my research, and will then present two recent lines of work that focus on information discovery in two important application domains.

In the first part of this talk, I will present an approach for tracking and querying fine-grained provenance in data-intensive workflows. ...
The focus of my research is on enabling novel kinds of interaction between the user and the information in a variety of digital environments, ranging from social content sites, to digital libraries, to the Web. In this talk, I will give an overview of my research, and will then present two recent lines of work that focus on information discovery in two important application domains.

In the first part of this talk, I will present an approach for tracking and querying fine-grained provenance in data-intensive workflows. A workflow is an encoding of a sequence of steps that progressively transform data products. Workflows help make experiments reproducible, and may be used to answer questions about data provenance -- the dependencies between input, intermediate, and output data. I will describe a declarative framework that captures fine-grained dependencies, enabling novel kinds of analytic queries, and will demonstrate that careful design and leveraging distributed processing make tracking and querying fine-grained provenance feasible.

In the second part of this talk, I will discuss personalized search and ranking on the Social Web. Social Web users provide information about themselves in stored profiles, register their relationships with other users, and express their preferences with respect to information and products. I will argue that information discovery should account for a user's social context, and will present network-aware search – a novel search paradigm in which result relevance is computed with respect to a user's social network. I will describe efficient algorithms appropriate for this setting, and will show how social similarities between users may be leveraged to make processing more efficient.
Read more

Why and How: A Reverse Perspective on Data Management

Alexandra Meliou University of Washington
27 Feb 2012, 10:30 am - 12:00 pm
Kaiserslautern building G26, room 206
simultaneous videocast to Saarbrücken building E1 5, room 5th floor
SWS Colloquium
Current trends have seen data grow larger, more intertwined, and more diverse, as more and more users contribute to and use it. This trend has given rise to the need to support richer data analysis tasks. Such tasks involve determining the causes of observations, finding and correcting the sources of error in query results, as well as modifying the data in order to make it conform to complex desirable properties.

In this talk I will discuss three challenges: (a) providing explanations through support for causal queries ("Why"), ...
Current trends have seen data grow larger, more intertwined, and more diverse, as more and more users contribute to and use it. This trend has given rise to the need to support richer data analysis tasks. Such tasks involve determining the causes of observations, finding and correcting the sources of error in query results, as well as modifying the data in order to make it conform to complex desirable properties.

In this talk I will discuss three challenges: (a) providing explanations through support for causal queries ("Why"), (b) tracing and correcting errors at their source (post-factum data cleaning), and (c) integrating database systems with constrained optimization capabilities ("How"). First, I will show how to apply causal reasoning to tuple provenance in order to determine the causes of query results, and their responsibility. I will present extensive analysis of the data complexity for the case of conjunctive queries, and focus on a complete dichotomy between NP-hard and PTIME cases for the problem of computing responsibility. This concrete characterization of PTIME cases is crucial in scaling up to the challenges of Big Data. Second, I will demonstrate the applicability of the causality framework in a practical setting. I will use a mobile sensing application to show that ranking provenance tuples by their degrees of responsibility identifies errors more effectively than other schemes. Finally, I will present the Tiresias system, the first how-to query engine, which seamlessly integrates database systems with constrained problem solving capabilities. The contributions of the system are threefold: (a) a declarative interface for defining how-to queries over a database, (b) translation rules from the declarative statements to the constrained problem specification, and (c) a suite of data-specific optimizations that allow scaling to large data sizes. Initial results of our prototype system implementation show order-of-magnitude speedups to state-of-the-art solver runtimes, which indicates that there are significant gains in pushing this functionality within the database engine. I will conclude with a summary of my contributions, and discuss my future steps with the Tiresias system, and the bigger vision of reverse data management.
Read more

No Free Lunch in Data Privacy

Ashwin Machanavajjhala Yahoo! Santa Clara, CA
20 Feb 2012, 10:30 am - 12:00 pm
Kaiserslautern building G26, room 206
simultaneous videocast to Saarbrücken building E1 5, room 5th floor
SWS Colloquium
Tremendous amounts of personal data about individuals are being collected and shared online. Legal requirements and an increase in public awareness due to egregious breaches of individual privacy have made data privacy an important field of research. Recent research, culminating in the development of a powerful notion called differential privacy, have transformed this field from a black art into a rigorous mathematical discipline. 

This talk critically analyzes the trade-off between accuracy and privacy in the context of social advertising – recommending people, ...
Tremendous amounts of personal data about individuals are being collected and shared online. Legal requirements and an increase in public awareness due to egregious breaches of individual privacy have made data privacy an important field of research. Recent research, culminating in the development of a powerful notion called differential privacy, have transformed this field from a black art into a rigorous mathematical discipline. 

This talk critically analyzes the trade-off between accuracy and privacy in the context of social advertising – recommending people, products or services to users based on their social neighborhood. I will present a theoretical upper bound on the accuracy of performing recommendations that are solely based on a user's social network, for a given level of (differential) privacy of sensitive links in the social graph. I will show using real networks that good private social recommendations are feasible only for a small subset of the users in the social network or for a lenient setting of privacy parameters.

I will also describe some exciting new research about a no free lunch theorem, which argues that privacy tools (including differential privacy) cannot simultaneously guarantee utility as well as privacy for all types of data, and conclude with directions for future research in data privacy and big-data management.
Read more

Robust replication

Allen Clement Max Planck Institute for Software Systems
13 Feb 2012, 10:30 am - 12:00 pm
Kaiserslautern building G26, room 206
simultaneous videocast to Saarbrücken building E1 5, room Wartburg 5th floor
SWS Colloquium
The choice between Byzantine and crash fault tolerance is viewed as a fundamental design decision when building fault tolerant systems. We show that this dichotomy is not fundamental, and present a unified model of fault tolerance in which the number of tolerated faults of each type is a configuration choice. Additionally, we observe that a single fault is capable of devastating the performance of existing Byzantine fault tolerant replication systems. We argue that fault tolerant systems should, ...
The choice between Byzantine and crash fault tolerance is viewed as a fundamental design decision when building fault tolerant systems. We show that this dichotomy is not fundamental, and present a unified model of fault tolerance in which the number of tolerated faults of each type is a configuration choice. Additionally, we observe that a single fault is capable of devastating the performance of existing Byzantine fault tolerant replication systems. We argue that fault tolerant systems should, and can, be designed to perform well even when failures occur. In this talk I will expand on these two insights and describe our experience leveraging them to build a generic fault tolerant replication library that provides flexible fault tolerance and robust performance. We use the library to build a fault tolerant version of the Hadoop Distributed File system.
Read more

Exploring the Technical and Economic Factors Underlying Internet Scams

Prof. Geoffrey Voelker University of California, San Diego
16 Jan 2012, 11:00 am - 12:30 pm
Saarbrücken building E1 5, room 5th floor
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Distinguished Lecture Series
Today, the large-scale compromise of Internet hosts serves as a platform for supporting a range of criminal activity in the so-called Internet underground economy. In this talk I will quickly survey work that our group has performed over the past decade on the problems posed by these threats, and how our research directions have evolved over time. In the remainder of the talk, I will describe recent work that our group has performed, including the ecosystem of CAPTCHA-solving service providers and an end-to-end analysis of the spam value chain. ...
Today, the large-scale compromise of Internet hosts serves as a platform for supporting a range of criminal activity in the so-called Internet underground economy. In this talk I will quickly survey work that our group has performed over the past decade on the problems posed by these threats, and how our research directions have evolved over time. In the remainder of the talk, I will describe recent work that our group has performed, including the ecosystem of CAPTCHA-solving service providers and an end-to-end analysis of the spam value chain. Using extensive measurements over months of diverse spam data, broad crawling of naming and hosting infrastructures, and product purchases from a wide variety of spam-advertised sites, I'll characterize the relative prospects for anti-spam interventions at multiple levels.

This work is part of a larger effort of the Collaborative Center for Internet Epidemiology and Defenses (CCIED), a joint NSF Cybertrust Center with UCSD and ICSI (http://www.ccied.org), and an ONR MURI collaboration (http://www.sysnet.ucsd.edu/botnets).
Read more

Dissent: Accountable Anonymous Group Communication

Bryan Ford Yale University
04 Jan 2012, 11:00 am - 12:00 pm
Saarbrücken building E1 5, room 5th floor
simultaneous videocast to Kaiserslautern building G26, room 206
SWS Colloquium
The ability to participate anonymously in online communities is widely valued, but existing anonymous communication protocols offer limited security against anonymous abuse, traffic analysis, or denial-of-service attacks. Dissent is a general-purpose anonymous messaging protocol that addresses these limitations in the context of group communication applications, where members of a well-defined group wish to "post" messages anonymously to each other or onto a common "bulletin board" without a message being linkable to a particular group member. Unlike most existing anonymous communication schemes, ...
The ability to participate anonymously in online communities is widely valued, but existing anonymous communication protocols offer limited security against anonymous abuse, traffic analysis, or denial-of-service attacks. Dissent is a general-purpose anonymous messaging protocol that addresses these limitations in the context of group communication applications, where members of a well-defined group wish to "post" messages anonymously to each other or onto a common "bulletin board" without a message being linkable to a particular group member. Unlike most existing anonymous communication schemes, the Dissent protocol provides anonymity while also holding its members accountable: if a group member deviates from the protocol in attempt to block communication or attack other members anonymously, the protocol ensures that the group can identify and expel the misbehaving member. The key technical idea enabling this combination of anonymity and accountability is to use a verifiable cryptographic shuffle scheme as a setup phase for a dining cryptographers (DC-nets) communication channel. The verifiable shuffle enables the group to create and agree on a random permutation of user identities to pseudonyms, then use that permutation as a logical schedule for subsequent DC-nets communication. The group can use this agreed-upon schedule to identify any member attempting to disrupt communication by deviating from the schedule or jamming the DC-nets channel. Currently working prototypes of Dissent support small groups, but ongoing efforts are extending the protocol to scale to large groups, handle node failure and network churn gracefully, and address intersection attacks against users who maintain long-term pseudonyms
Read more