DEI on Wednesday

Sparse Attention Mechanisms for Natural Language Processing 

Finished

I will start by giving a brief overview of my DeepSPIN ERC project (https://deep-spin.github.io), whose goal is to develop new deep learning methods, models, and algorithms for structured prediction in natural language processing (NLP). Then, I will cover in more detail some recent work done in my group on sparse attention mechanisms. Attention mechanisms have become ubiquitous in NLP. Recent architectures, notably the Transformer, learn powerful context-aware word representations through layered, multi-headed attention. The multiple heads learn diverse types of word relationships. However, with standard softmax attention, all attention heads are dense, assigning a non-zero weight to all context words. In this talk, I will introduce the adaptively sparse Transformer, wherein attention heads have flexible, context-dependent sparsity patterns. This sparsity is accomplished by replacing softmax with alpha-entmax: a differentiable generalization of softmax that allows low-scoring words to receive precisely zero weight. Moreover, we derive a method to automatically learn the alpha parameter—which controls the shape and sparsity of alpha-entmax—allowing attention heads to choose between focused or spread-out behavior. Our adaptively sparse Transformer improves interpretability and head diversity when compared to softmax Transformers on machine translation datasets. Findings of the quantitative and qualitative analysis of our approach include that heads in different layers learn different sparsity preferences and tend to be more diverse in their attention distributions than softmax Transformers. Furthermore, at no cost in accuracy, sparsity in attention heads helps to uncover different head specializations. Joint work with Ben Peters, Gonçalo Correia, Vlad Niculae, Chaitanya Malaviya, Pedro Ferreira, Julia Kreutzer, Mathieu Blondel, Claire Cardie, Ramon Astudillo.
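As an illustration of how a softmax replacement can assign exactly zero weight, the sketch below implements sparsemax, which is the alpha = 2 member of the alpha-entmax family (softmax is the alpha = 1 case). It is a minimal NumPy sketch for a single score vector, not the speaker's implementation: the function names and the example scores are assumptions chosen for illustration, and the learned, per-head alpha described in the abstract is not covered here.

```python
import numpy as np


def softmax(z):
    """Standard softmax: every entry receives a strictly positive weight."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()


def sparsemax(z):
    """Euclidean projection of the scores onto the probability simplex.

    This is alpha-entmax with alpha = 2; low-scoring entries get exactly zero.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]              # scores in decreasing order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    in_support = 1 + k * z_sorted > cumsum   # entries that stay non-zero
    k_max = k[in_support][-1]                # size of the support
    tau = (cumsum[k_max - 1] - 1) / k_max    # threshold subtracted from all scores
    return np.maximum(z - tau, 0.0)


# Hypothetical attention scores for three context words.
scores = [1.0, 0.8, -1.0]
dense = softmax(scores)     # all three weights are positive
sparse = sparsemax(scores)  # the lowest-scoring word gets weight exactly 0
print(dense, sparse)
```

With these example scores, sparsemax keeps the two highest-scoring words and drops the third entirely, whereas softmax still gives the third word a small positive weight.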

Time: April 22, 2020 at 01:30PM
Place: Videoconference
Speaker: André Martins

Robust Revenue Maximization Under Minimal Statistical Information 

Finished

We study the problem of multi-dimensional revenue maximization when selling $m$ items to a buyer who has additive valuations for them, drawn from a (possibly correlated) prior distribution. Unlike traditional Bayesian auction design, we assume that the seller has very restricted knowledge of this prior: they only know the mean $\mu_j$ and an upper bound $\sigma_j$ on the standard deviation of each item's marginal distribution. Our goal is to design mechanisms that achieve good revenue against an ideal optimal auction that has full knowledge of the distribution in advance. We show that selling the items via separate price lotteries achieves an $O(\log r)$ approximation ratio, where $r=\max_j(\sigma_j/\mu_j)$ is the maximum coefficient of variation across the items. If we are restricted to deterministic mechanisms, this guarantee degrades to $O(r^2)$. Assuming independence of the item valuations, these ratios can be further improved by pricing the full bundle. We demonstrate the optimality of the above mechanisms by providing matching lower bounds. Our tight analysis for the deterministic case resolves an open gap from the work of Azar and Micali [ITCS'13]. As a by-product, we also show how one can directly use our upper bounds to improve and extend previous results related to the parametric auctions of Azar et al. [SODA'13]. This talk is based on joint work with Yiannis Giannakopoulos and Alexandros Tsigonias-Dimitriadis.
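To make the parameter $r$ concrete, the short sketch below computes the maximum coefficient of variation from per-item means and standard-deviation bounds. The numbers are hypothetical and chosen only for illustration; the sketch shows the quantity the guarantees are stated in, not the pricing mechanisms themselves.

```python
# Hypothetical per-item statistics known to the seller (illustrative values only).
means = [10.0, 4.0, 5.0]     # mu_j: mean valuation of item j
std_bounds = [5.0, 8.0, 5.0]  # sigma_j: upper bound on the standard deviation of item j

# Coefficient of variation of each item, and its maximum r = max_j(sigma_j / mu_j).
ratios = [sigma / mu for sigma, mu in zip(std_bounds, means)]
r = max(ratios)

print(ratios, r)
```

Here the second item dominates, so $r$ is determined by the item whose valuation is most uncertain relative to its mean.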

Time: April 01, 2020 at 01:00PM
Place: Videoconference
Speaker: Diogo Poças

Designing for Emotional Meaning-Making with Data 

Finished

From Fitbit to Apple Watch to sensors embedded in walls, furniture, and underwear, a growing amount of biosensory data about people's bodies, behaviors, thoughts, and feelings presents sense-making challenges and opportunities. While prevalent approaches leverage data analysis to promote individual productivity and normative wellness, my design research contributes alternative design tactics for engaging biosensory data to more effectively support social, embodied, and emotional meaning-making. I will demonstrate this concept through two projects. The first, the color-changing garment Ripple, explores how ambiguity can be a valuable design tactic for inviting open-ended social and emotional reflection. I created ordinary-looking shirts with embedded biosensors and display elements, and studied how pairs of friends interpreted the display throughout their daily lives. My second project, the Heart Sounds Bench, explores life-affirmation as an alternative design frame for public sensing. I created a bench that amplifies the live, unfiltered heart sounds of bench-sitters, and studied how pairs of strangers experienced listening to their heart sounds emanate into the environment. Through this work, I envision critically reworking conceptions of sensing and data to support different ways of knowing.

Time: March 11, 2020 at 01:30PM
Place: Videoconference
Speaker: Noura Howell

On the Evolution and Quality of Requirements: Industry’s Reality and Academia’s Efforts 

Finished

Requirements models have been developed to support the work of requirements engineers and stakeholders. They provide abstraction mechanisms that, for example, facilitate communication among them by structuring requirements better, which in turn helps with their analysis. Nevertheless, the extent to which requirements modelling languages are used, and are adequate for communication purposes, has been somewhat limited. On the one hand, we first studied the evolution of requirements practices in industry, particularly in software startups as they grow and introduce new products and services. These startups operate in a dynamic environment, with significant time and market pressure, and rarely have time for systematic requirements analysis. We describe the evolution of practice along several dimensions (e.g. requirements artefacts, product quality) that emerged as relevant to their requirements activities. We provide a theory that organises knowledge about evolving requirements practice in maturing startups and offers practical insights for startups assessing their own evolution as they face challenges to their growth. On the other hand, from the academic perspective, we have studied several quality aspects, ranging from the lack of abstraction mechanisms for managing model complexity to the impact of model layout and of the notation adopted. In this talk, I will discuss these issues based on the application of Grounded Theory (in the study of requirements and startups) and on the results of experiments in which metrics were collected to evaluate some quality aspects of requirements models, in particular requirements goal models (increasingly popular in the requirements community).

Time: March 04, 2020 at 01:30PM
Place: Alameda - Sala José Tribolet (0.19) - Pavilhão Informática II | TagusPark - Sala 2N1.5 (through videoconference)
Speaker: João Araújo

Learning with Sparse Latent Structure 

Finished

Structured representations are a powerful tool in machine learning, in particular for natural language: The discrete, compositional nature of words and sentences leads to natural combinatorial representations such as trees, sequences, segments, or alignments, among others. Such representations are at odds with deep neural networks, which conventionally perform smooth, soft computations, learning dense, inscrutable hidden representations. We present SparseMAP, a strategy for inferring differentiable combinatorial latent structures, alleviating the tension between discrete and continuous representations through sparsity. SparseMAP computes a globally-optimal combination of a very small number of structures, and is applicable in arbitrary factor graphs, only requiring access to local maximization oracles. Our strategy is fully deterministic and compatible with familiar gradient-based methods for training neural networks. We demonstrate sparse and structured neural hidden layers, with successful empirical results and visualisation properties.

Time: February 26, 2020 at 01:30PM
Place: Alameda - Sala José Tribolet (0.19) - Pavilhão Informática II | TagusPark - Sala 2N1.5 (through videoconference)
Speaker: Vlad Niculae
