Seminars

A series of seminars has recently been completed on the following topics:

  • Bayesian Estimation
  • Kernel/SVM
  • Cognitive methods
  • Natural images methods
  • Inference (logical, statistical, cognitive)
  • First-Order Logic & Probability
  • Flexible Planning

Seminar Thursday 10 September 2015 - Prof. Anthony Cohn - 17:30 Room A2

Title: Learning about activities, spatial relations and spatial language from video

Speaker: Prof. Anthony Cohn, University of Leeds

Abstract: In this talk I will present work undertaken at Leeds on building models of activity from video and other sensors, using both supervised and unsupervised techniques. The representations exploit qualitative spatio-temporal relations to provide symbolic models at a relatively high level of abstraction. I will discuss techniques for handling noise in the video data and I will also show how objects can be "functionally categorised" according to their spatio-temporal behaviour. Finally I will present very recent results on learning and grounding language from video-sentence pairs.


Seminar Friday 3 July 2015 - Prof. Michael Beetz - 12:00 Aula Magna

Title: openEASE --- A Knowledge Processing Service for Robots and Robotics Researchers

Speaker: Prof. Michael Beetz, Institute for Artificial Intelligence, University of Bremen, Germany

Description: Making future autonomous robots capable of accomplishing human-scale manipulation tasks requires us to equip them with knowledge and reasoning mechanisms. We propose openEASE, a remote knowledge representation and processing service that aims at facilitating these capabilities. openEASE provides its users with unprecedented access to the knowledge of leading-edge autonomous robotic agents. It also provides the representational infrastructure to make inhomogeneous experience data from robots and human manipulation episodes semantically accessible, as well as a suite of software tools that enable researchers and robots to interpret, analyze, visualize, and learn from the experience data. Using openEASE, users can retrieve the memorized experiences of manipulation episodes and ask queries regarding what the robot saw, reasoned, and did, as well as how the robot did it, why, and what effects it caused.


Seminar Thursday 2 July 2015 - Dr. Rocco De Rosa - 14:30 Room B101

Title: Action Recognition in Streaming Videos via Incremental Active Learning

Speaker: Dr. Rocco De Rosa

Description: In this talk, we introduce a novel incremental and active learning classification approach that can be used with any local or global set of feature descriptors extracted from a segmented video stream. Our system is nonparametric: it covers the feature space with classifiers that locally approximate the Bayes optimal classifier. We focus on streaming scenarios, in which our approach features incremental model updates and on-the-fly addition of new classes. Moreover, predictions are computed in time logarithmic in the model's size (which is typically fairly small), and active learning is used to save labeling costs. A "constant budget" variant is also presented to limit the growth of the model size over time, an appealing feature in real-time applications. We apply this methodology to human activity recognition tasks. Experiments on standard benchmarks show that our approach is competitive with state-of-the-art non-incremental methods, and outperforms existing active incremental baselines.
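The streaming behavior sketched in the abstract (a cover of local classifiers, label queries only when the model is uncertain, new classes added on the fly) can be illustrated in a few lines. The snippet below is a toy illustration, not the authors' actual method or code: the class name, the fixed query radius, and the `oracle` callback are all hypothetical choices made for the example.

```python
import math

class IncrementalActiveClassifier:
    """Toy sketch: cover the feature space with labeled prototypes.

    A new point is labeled by its nearest prototype. A human label is
    requested (active learning) only when the nearest prototype is far
    away, which also lets new classes appear on the fly.
    """

    def __init__(self, radius=1.0):
        self.radius = radius          # distance beyond which we query
        self.prototypes = []          # list of (vector, label) pairs

    def _nearest(self, x):
        best = None
        for p, y in self.prototypes:
            d = math.dist(p, x)
            if best is None or d < best[0]:
                best = (d, y)
        return best                   # (distance, label) or None

    def predict(self, x):
        near = self._nearest(x)
        return near[1] if near else None

    def partial_fit(self, x, oracle):
        """Process one stream item; call oracle(x) only when unsure."""
        near = self._nearest(x)
        if near is None or near[0] > self.radius:
            y = oracle(x)             # costly label request
            self.prototypes.append((x, y))
            return y, True            # label was queried
        return near[1], False         # confident: no label spent
```

In a stream, `partial_fit` is called once per frame descriptor; the fraction of calls that return `True` is the labeling cost saved by the active rule.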


Seminar Monday 22 June 2015 - Prof. Nicolò Cesa-Bianchi - 11:30 Room B203

Title: An algorithmic approach to nonparametric online learning

Speaker: Prof. Nicolò Cesa-Bianchi, University of Milan (La Statale)

Description: In this talk, we describe a general algorithmic approach to nonparametric learning in data streams. Our method covers the input space using simple classifiers that are locally trained. A good balance between model complexity and predictive accuracy is achieved by dynamically adapting the cover to the local complexity of the classification problem. For the simplest instance of our approach, we prove a theoretical performance guarantee against any Lipschitz classifier and without stochastic assumptions on the stream. Experiments on standard benchmarks complement the theoretical results, showing good performance even when the model size is kept independent of the stream length.
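The idea of dynamically adapting the cover to local complexity can be sketched as follows: each ball predicts a label, and a ball that errs too often is refined so the cover becomes finer exactly where the problem is locally hard. This is an illustrative toy under assumed design choices (the radius-halving rule, the `patience` parameter, and all names are invented for the example), not the algorithm analyzed in the talk.

```python
import math

class AdaptiveCoverClassifier:
    """Toy nonparametric online learner: cover the input space with
    balls, each holding a label; refine a ball where it keeps erring."""

    def __init__(self, radius=1.0, patience=3):
        self.init_radius = radius
        self.patience = patience      # mistakes tolerated before refining
        self.balls = []               # dicts: center, radius, label, mistakes

    def _owner(self, x):
        for b in self.balls:
            if math.dist(b["center"], x) <= b["radius"]:
                return b
        return None

    def predict(self, x):
        b = self._owner(x)
        return b["label"] if b else None

    def partial_fit(self, x, y):
        b = self._owner(x)
        if b is None:                 # uncovered region: open a new ball
            self.balls.append({"center": x, "radius": self.init_radius,
                               "label": y, "mistakes": 0})
            return
        if b["label"] != y:
            b["mistakes"] += 1
            if b["mistakes"] >= self.patience:
                b["radius"] /= 2      # shrink: locally hard region
                b["mistakes"] = 0
            if math.dist(b["center"], x) > b["radius"]:
                # the shrunken ball no longer covers x: add a finer ball
                self.balls.append({"center": x, "radius": b["radius"],
                                   "label": y, "mistakes": 0})
```

Where the two classes are well separated a single coarse ball suffices, so the model stays small; near the decision boundary the balls shrink and multiply.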


Seminar Thursday 14 May 2015 - Vincenza Ferrara - 14:30 A4

Title: Which Technologies for the Promotion, Use and Reuse of Cultural Heritage

Speaker: Dr. Vincenza Ferrara

Description: In recent years, technologies have taken on a predominant role in the cultural heritage sector as well. In 2011, the European Digital Agenda also recommended making cultural content accessible and allowing its reuse in education, tourism and the creative industries. Personalized access to content via the web, tablet and mobile applications, and games are useful products to develop in this context. This talk presents some trends in the development of technologies to engage museum visitors in exploring content in depth (augmented reality), to make cultural heritage accessible in schools by creating dedicated platforms for annotating educational content and producing multimedia lessons, and to develop open data in the cultural heritage sector and promote its reuse, including in the production of apps.


Seminar Tuesday 12 May 2015 - Constantine Raftopoulos - 10:00 B203

Title: Global Curvature and the Noising Paradox

Speaker: Dr. Constantine Raftopoulos, NTUA Athens, Greece

Description: Curvature, as a descriptor of shape (e.g. describing the boundary of planar shapes), possesses a rare combination of good properties: it is intrinsic, intuitive, well defined, extensively studied, and of undisputed perceptual importance. However, there are at least two serious problems concerning its use as such in computer vision. The first has to do with noise. In a noisy curve, that is, one having high-frequency Fourier components (hfFc) of no perceptual importance, the local nature of curvature restricts it to describing the noise itself rather than the underlying shape. Knowing whether the hfFc of a curve represent noise or not would require solving the harder problem of recognizing the object. Since hfFc might be defining for certain shapes and just noise in others, their presence in unrecognized (unknown) shapes is considered problematic, although they may carry useful shape information. In practice, they are usually eliminated from the boundary of all shapes by a blind smoothing step, at the risk of losing useful discriminating shape information. Smoothing also distorts the shape's metrics in an unpredictable manner, a highly undesirable effect whenever certain "morphometric" measurements are defining for classification. The second problem with curvature as a descriptor has to do with "meaningfulness". Even in noise-free curves, the local nature of curvature doesn't permit any kind of "context" by means of which one could differentiate, with respect to their perceptual characteristics, between points of similar curvature on different parts of the curve. Behind both of these problems is curvature's local nature, and it seems that any solution would have to defy the local definition of curvature. In this talk the local nature of curvature will be challenged at a theoretical level, and an attempt to address the above problems based on an alternative global definition of curvature will be discussed.
The new concept of "noising" (as opposed to smoothing) emerges as a paradox, and a new method for identifying vertices without even having to calculate curvature will be presented. Experiments with smooth and noisy KIMIA and MPEG silhouettes, and a comparison to localized methods, support the theoretical findings.


Seminar Thursday 9 April 2015 - Tatiana Tommasi - 15:00 Aula Magna

Title: Learning to learn: how far we are from the solution

Speaker: Dr. Tatiana Tommasi, ESAT KU-Leuven

Description: The talk will give an overview of the current state of the art in learning to learn applied to visual recognition, highlight some success stories, and underline the future challenges ahead.


Seminar Thursday 19 December 2013 - Francesco Orabona - 14:30 Aula Magna

Title: Adaptation in online learning through dimension-free exponentiated gradient

Speaker: Prof. Francesco Orabona, TTI Chicago, USA

Abstract: As the big data paradigm is gaining momentum, learning algorithms trained through fast stochastic gradient descent methods are becoming the de-facto standard in industry. Still, even these simple procedures cannot be used completely "off-the-shelf", because parameters, e.g. the learning rate, have to be properly tuned to the particular problem to achieve fast convergence. The online learning framework is a powerful tool for designing fast learning algorithms able to work in both the stochastic and the adversarial setting. In this talk I will introduce new advancements in the time-varying regularization framework for online learning, which allows one to derive almost parameter-free adaptive algorithms. In particular, I will focus on a new algorithm based on a dimension-free exponentiated gradient. Contrary to existing online algorithms, it achieves an optimal regret bound, up to logarithmic terms, without any parameters or any prior knowledge about the optimal solution.
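For context, the classical exponentiated gradient update (Kivinen and Warmuth) that the talk's dimension-free variant builds on fits in a few lines. Note that this sketch still carries the learning rate `eta` that the parameter-free algorithm dispenses with; it is background for the talk, not Orabona's algorithm itself.

```python
import math

def eg_update(w, grad, eta):
    """One step of classical exponentiated gradient on the simplex:
    a multiplicative update by exp(-eta * gradient), renormalized so
    the weights remain a probability distribution."""
    w = [wi * math.exp(-eta * gi) for wi, gi in zip(w, grad)]
    s = sum(w)
    return [wi / s for wi in w]
```

The tuning pain the abstract refers to is visible here: the regret of this update depends on choosing `eta` well, which in turn requires prior knowledge (e.g. of the time horizon or the gradients' scale) that the dimension-free algorithm does not need.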


Seminar Thursday 19 December 2013 - Annie Ruimi - 11:00 Room A7

Title: Thread simulations for biomedical applications

Speaker: Annie Ruimi, Texas A&M University at Qatar

Abstract: Over the last twenty years, sporadic efforts have been made to bring computer simulations into the medical field. Large research institutions and training hospitals in the US (e.g. Stanford, Harvard, Massachusetts General Hospital) have now made simulations a mandatory part of their education, but there is still much resistance from the medical profession. Within these efforts, modeling and simulation of organs have received more attention than the modeling of surgical tools. One reason advanced by medical and health professionals is the lack of realistic scenarios portrayed on screen, which makes it difficult for students to adequately train in various surgical tasks. Among these tasks, suturing is recognized as being particularly challenging for medical school students. To address this problem, we have assembled an international team of scientists and researchers with expertise in engineering, medicine and computer graphics to develop low-cost, interactive software that will be used by medical school students to train in the tasks of suturing and knotting. The program is funded by Qatar Foundation through its National Priorities Research Program (NPRP). I will give an overview of the many elements of the program and what we have achieved so far.

Speaker's short bio: Annie Ruimi joined Texas A&M University at Qatar in July 2007 as a Visiting Assistant Professor of Mechanical Engineering after obtaining her Ph.D. from the University of California at Santa Barbara, and was promoted to Assistant Professor in July 2009. Her research uses a combination of theoretical and computational tools to solve problems represented by rod-like structures, with applications in medical simulations and drillstring dynamics. She also investigates the relationship between microstructure and material properties to design advanced (or smart) materials for automotive applications. She teaches courses in Statics, Dynamics and Vibrations, and Mechanics of Materials. She helped develop a sophisticated experimental set-up for the Rotor Dynamics Branch at NASA Ames Research Center (California), where she acquired experience in large-scale data acquisition and management. She is an international collaborator on a large National Science Foundation (NSF) grant awarded to Texas A&M University (Texas) for the development of an International Institute for Multifunctional Materials for Energy Conversion (IIMEC), which brings together researchers from more than ten countries in the Middle East, North Africa and the Mediterranean region. Her research is currently funded by the National Priorities Research Program (NPRP) managed by Qatar Foundation. She is a life member of the Sigma Gamma Tau and Tau Beta Pi Engineering Honor Societies, and a member of the American Society of Mechanical Engineers and the American Institute of Aeronautics and Astronautics.


Seminar Monday 11 November 2013 - Antonis Argyros - 15:00 Aula Magna

Title: Tracking the Motion of Human Hands

Speaker: Antonis Argyros

Abstract: Humans use their hands in most of their everyday life activities. Thus, the development of technical systems that track the 3D position, orientation and full articulation of human hands from markerless visual observations can be of fundamental importance in supporting a number of diverse applications. In this talk, we provide an overview of our work on hand tracking. First, we describe methods for vision-based detection and tracking of hands and fingers in 2D, with emphasis on occlusions handling and illumination invariance. We also demonstrate hand posture recognition techniques and their use in HCI and HRI. Then, we focus on a recently proposed framework for exploiting markerless visual observations to track the 3D position, orientation and full articulation of a human hand that moves in isolation in front of an RGBD camera. We treat this as an optimization problem that is effectively solved using a variant of Particle Swarm Optimization (PSO). Next, we show how the core of the tracking framework has been employed to provide state-of-the-art solutions for problems of even higher dimensionality and complexity, e.g., for tracking two strongly interacting hands or for tracking the state of a complex scene where a hand interacts with several objects. Finally, we demonstrate how the results of hand tracking have been used to recognize human actions and infer human intentions in the context of tabletop object manipulation scenarios.
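Particle Swarm Optimization, the optimizer at the core of the tracking framework described above, is easy to sketch generically. The snippet below is a plain, self-contained PSO minimizer, not the tracker's actual implementation (which minimizes a discrepancy between a rendered hand model and RGBD observations over a ~27-dimensional pose space); the parameter values are common illustrative defaults.

```python
import random

def pso_minimize(f, dim, bounds, n_particles=30, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over [lo, hi]^dim with a basic particle swarm.

    Each particle keeps a velocity blending inertia (w), attraction to
    its personal best (c1) and attraction to the swarm's best (c2).
    Returns (best_position, best_value).
    """
    rnd = random.Random(seed)
    lo, hi = bounds
    X = [[rnd.uniform(lo, hi) for _ in range(dim)]
         for _ in range(n_particles)]            # positions
    V = [[0.0] * dim for _ in range(n_particles)]  # velocities
    P = [x[:] for x in X]                        # personal bests
    pbest = [f(x) for x in X]
    g = min(range(n_particles), key=lambda i: pbest[i])
    G, gbest = P[g][:], pbest[g]                 # global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                V[i][d] = (w * V[i][d]
                           + c1 * rnd.random() * (P[i][d] - X[i][d])
                           + c2 * rnd.random() * (G[d] - X[i][d]))
                X[i][d] = min(hi, max(lo, X[i][d] + V[i][d]))
            fx = f(X[i])
            if fx < pbest[i]:
                P[i], pbest[i] = X[i][:], fx
                if fx < gbest:
                    G, gbest = X[i][:], fx
    return G, gbest
```

Because PSO only needs objective evaluations (no gradients), it tolerates the discontinuous, occlusion-ridden objectives that arise when comparing rendered hand hypotheses against camera observations.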

Speaker's short bio: Antonis Argyros is an Associate Professor at the Computer Science Department, University of Crete, and a researcher at the Institute of Computer Science (ICS), Foundation for Research and Technology-Hellas (FORTH) in Heraklion, Crete, Greece. He received a B.Sc. degree in Computer Science (1989) and an M.Sc. degree in Computer Science (1992), both from the Computer Science Department, University of Crete. In July 1996, he completed his PhD on visual motion analysis at the same department. He has been a postdoctoral fellow at the Computational Vision and Active Perception Laboratory (CVAP) at the Royal Institute of Technology in Stockholm, Sweden. Since 1999, as a member of the Computational Vision and Robotics Laboratory (CVRL) of FORTH-ICS, he has been involved in many RTD projects in computer vision, image analysis and robotics. He is an area editor for the Computer Vision and Image Understanding (CVIU) journal, a member of the Editorial Board of the IET Image Processing Journal, and was one of the general chairs of the 11th European Conference on Computer Vision (ECCV 2010, Heraklion, Crete). He is also a faculty member of the Brain and Mind interdisciplinary graduate program and a member of the Strategy Task Group of the European Consortium for Informatics and Mathematics (ERCIM). The research interests of Argyros fall in the areas of computer vision, with emphasis on tracking, human gesture and posture recognition, 3D reconstruction and omnidirectional vision. He is also interested in applications of computational vision in the fields of robotics and smart environments.


Seminar Monday 28 October 2013 - Catherine Pelachaud  - 12:00 Aula Magna

Title: Communicating with Socio-Emotional Agents

Speaker: Catherine Pelachaud, CNRS - LTCI, TELECOM ParisTech

Abstract: In this talk I will present our current work toward endowing virtual agents with socio-emotional capabilities. Through its behaviors, the agent can sustain a conversation as well as show various attitudes and levels of engagement. I will describe the methods, based on corpus analysis, crowd-sourcing or motion capture, that we are using to enrich its repertoire of multimodal behaviors. These behaviors can be displayed with different qualities and intensities to simulate various communicative intentions and emotional states. We have been developing a platform of humanoid agents able to interact with humans. I will describe the architecture of our platform, which allows us to drive these different agent types. These agents, be they virtual or physical, can be controlled by two different representation languages, namely the Function Markup Language (FML), which specifies the communicative intentions and emotional states, and the Behavior Markup Language (BML), which describes the multimodal behaviors to be displayed by the agents. Then, I will describe an interactive system in which an agent dialogues with human users in an emotionally colored manner.

Speaker's short bio: Catherine Pelachaud is a Director of Research at CNRS in the laboratory LTCI, TELECOM ParisTech. She received her PhD in Computer Graphics at the University of Pennsylvania, Philadelphia, USA in 1991. Her research interests include embodied conversational agents, nonverbal communication (face, gaze, and gesture), expressive behaviors and socio-emotional agents. She has been involved, and is still involved, in several European projects related to multimodal communication (EAGLES, IST-ISLE), believable embodied conversational agents (IST-MagiCster, ILHAIRE, VERVE, REVERIE), emotion (Humaine, CALLAS, SEMAINE, TARDIS) and social behaviors (SSPNet). She is a member of the Humaine Association committee. She is an associate editor of several journals, among which IEEE Transactions on Affective Computing, ACM Transactions on Interactive Intelligent Systems and the Journal on Multimodal User Interfaces. She has co-edited several books on virtual agents and emotion-oriented systems.


Seminar Monday 14 October 2013 - Ales Leonardis - 11:00 Aula Magna

Title: Hierarchical Compositional Representations of Object Structure

Speaker: Ales Leonardis, University of Birmingham, UK

Abstract: Visual categorisation has been an area of intensive research in the vision community for several decades. Ultimately, the goal is to efficiently detect and recognize an increasing number of object classes. The problem entangles three highly interconnected issues: the internal object representation, which should compactly capture the visual variability of objects and generalize well over each class; a means for learning the representation from a set of input images with as little supervision as possible; and an effective inference algorithm that robustly matches the object representation against the image and scales favorably with the number of objects. In this talk I will present our approach, which combines a learned compositional hierarchy, representing (2D) shapes of multiple object classes, with a coarse-to-fine matching scheme that exploits a taxonomy of objects to perform efficient object detection.


Seminar Monday 7 January 2013 - Paolo Favaro - 14:30 Aula Magna

Title: The Light Field Camera: Extended Depth of Field, Aliasing and Superresolution

Speaker: Paolo Favaro, University of Bern

Abstract: Portable light field cameras have demonstrated capabilities beyond conventional cameras. In a single snapshot, they enable digital image refocusing, i.e., the ability to change the camera focus after taking the snapshot, and 3D reconstruction. We show that they also achieve a larger depth of field than conventional cameras while maintaining the ability to reconstruct detail at high resolution. More interestingly, we show that their depth of field is essentially inverted compared to regular cameras. Crucial to the success of the light field camera is the way it samples the light field, trading off spatial vs. angular resolution, and how aliasing affects the light field. We present a novel algorithm that estimates a full resolution sharp image and a full resolution depth map from a single input light field image. The algorithm is formulated in a variational framework and it is based on novel image priors designed for light field images. We demonstrate the algorithm on synthetic and real images captured with our own light field camera, and show that it can outperform other computational camera systems.

Speaker's short bio: Paolo Favaro received the Laurea degree (BSc+MSc) from Università di Padova, Italy in 1999, and the M.Sc. and Ph.D. degree in electrical engineering from Washington University in St. Louis in 2002 and 2003 respectively. He was a postdoctoral researcher in the computer science department of the University of California, Los Angeles and subsequently in Cambridge University, UK. Between 2004 and 2006 he worked in medical imaging at Siemens Corporate Research, Princeton, USA. From 2006 to 2011 he was Lecturer and then Reader at Heriot-Watt University and Honorary Fellow at the University of Edinburgh, UK. In 2012 he became full professor at Universität Bern, Switzerland. His research interests are in computer vision, computational photography, machine learning, signal and image processing, estimation theory, inverse problems and variational techniques. He is also a member of the IEEE Society.


Seminar - Opensource Computer Vision and Robotics

Gary Bradski, OpenCV Foundation, and Founder and CTO at Industrial Perception.
Vincent Rabaud, computer vision research engineer at Willow Garage.

October 15, 2012
9:15 - 13:30


Aula Magna, Dipartimento di Ingegneria Informatica, Automatica e Gestionale "Antonio Ruberti" - Sapienza Università di Roma
via Ariosto 25 Roma

Abstract

The event has been co-organised by Cattid and the Department of Computer, Control and Management Engineering and is meant to be a tutorial on two very popular open source software frameworks: OpenCV and ROS.

The first part of the tutorial will cover OpenCV, the Open Source Computer Vision library of programming functions for real time computer vision.

Given the growing interest in commercial applications of computer vision, the tutorial will provide some basics of OpenCV development for today's most widespread mobile platform: Android.

The second part of the tutorial is dedicated to ROS (Robot Operating System), providing libraries and tools to help software developers create robot applications.

This tutorial is intended to be hands-on. Attendees can bring their laptops with pre-installed software development kits. The specification of all the required software will be provided to the registered attendees.

We kindly ask those who plan to attend the tutorial to fill in the registration form here. This gives us the chance to communicate the software prerequisites for the hands-on exercises.


Bio sketch. 
Dr. Gary Rost Bradski is president and CEO of the OpenCV Foundation and Founder and CTO at Industrial Perception.

He is a Senior Scientist at Willow Garage, a robotics application incubator in Menlo Park. He also holds a joint appointment as Consulting Professor in Stanford University's Computer Science Department and has more than 50 publications, along with 13 issued patents and 18 pending. He brings more than 15 years of research and robotics experience to Willow Garage. Dr. Bradski is responsible for the Open Source Computer Vision Library (OpenCV), which is used globally in research, government and commercial applications. Earlier, Dr. Bradski organized the vision team for Stanley, the Stanford robot that won the DARPA Grand Challenge, and more recently helped found the Stanford Artificial Intelligence Robot (STAIR) project under the leadership of Professor Andrew Ng.

Vincent Rabaud joined Willow Garage in January 2011 as a research engineer in computer vision. With a background in structure from motion, his current focus is teaching a robot to recognize objects for grasping. Among other things, he is working on acquiring a database of household objects, developing 3D object recognition, and fast feature detection on cellphones.

His research interests include 3D, tracking, face recognition and anything that involves underusing CPUs by feeding them very fast algorithms.

Dr. Rabaud completed his PhD at UCSD, advised by Professor Belongie. He also holds an MS in space mechanics and space imagery from SUPAERO and a BS/MS in optimization from the Ecole Polytechnique.


Seminar 3 October 2012 - Dr. Qinfeng (Javen) Shi - 14:00 Room A3 DIIAG

Title: Probabilistic Graphical Models, Kernel Methods, and Compressive Sensing


Abstract: This talk will give an introduction to Probabilistic Graphical Models, Kernel Methods, and Compressive Sensing, and show our work on these topics (and more) in various applications and theory.


Speaker: Dr. Qinfeng (Javen) Shi, DECRA Fellow, School of Computer Science, The University of Adelaide, North Terrace, Adelaide, SA 5005, Australia


Seminar 15 December 2010 - Matei Mancas - 12:00 Aula Magna DIS

Title: Overview of the research at the IT department of the University of Mons, Belgium

Speaker: Matei Mancas, University of Mons

The talk will begin with a brief introduction to the university in the French-speaking Belgian community. Then, some classical research within the IT department will be described:

  • speech recognition.
  • speech synthesis.
  • image processing.

In the second part, some new trends in the research of the IT department are addressed:

  • audio expressivity.
  • multimedia retrieval.
  • computational attention.

Finally, the NumediaArt research project, which deals with digital arts and brings together much of the department's research, is presented.
Seminar Friday 30 July 2010 - Jim Little - 11:00 Aula A6 DIS

Title: Actively Using Vision and Context for Home Robotics

Speaker: Jim Little, University of British Columbia

Abstract: Increasingly we want computers and robots to observe us and know who we are and what we are doing, and to understand the objects and tasks in our world, both at work and in the home. I will describe how we've built systems for mobile robots to find objects using visual cues and learn about shared workspaces. Further, I will review how a range of visual capabilities permits the robot to work for and with humans.
We've demonstrated these abilities on Curious George, our visually guided mobile robot, which competed in and won the Semantic Robot Vision Challenge at AAAI (2007), CVPR (2008) and ISVC (2009), in a completely autonomous visual search task. In the SRVC, visual classifiers are learned from images gleaned from the Web. Challenges include poor image quality, badly labeled data and confusing semantics (e.g., synonyms). Clustering of training data, image quality analysis, and viewpoint-guided visual attention enable effective object search by a home robot.


Seminar Friday 23 July 2010 - Stefano Soatto - 16:00 Aula A6 DIS

Title: Data to Information to Cognition: The Lesson from Shannon to Gibson, and the First Steps towards a Theory of Visual Information

Speaker: Stefano Soatto, UCLA

Abstract: I will discuss a notion of visual information as complexity not of the raw data, but of the images after the effects of nuisance factors such as viewpoint and illumination are discounted. It is rooted in ideas of J. J. Gibson, and stands in contrast to traditional information as entropy or coding length of the data regardless of its use, and regardless of the nuisance factors affecting it. Its computation is made possible by a recent characterization of the set of images modulo viewpoint and contrast changes, that induce group (invertible) transformations on the domain and range of the image. The non-invertibility of nuisances such as occlusion and quantization induces an "information gap" that can only be bridged by controlling the data acquisition process. Measuring visual information entails early vision operations, tailored to the structure of the nuisances so as to be "lossless" with respect to visual decision and control tasks (as opposed to data transmission and storage tasks implicit in traditional information theory). I illustrate these ideas on visual exploration, whereby a "Shannonian Explorer" navigates unaware of the structure of the physical space surrounding it, while a "Gibsonian Explorer" is guided by the topology of the environment, despite measuring only images of it, without performing 3D reconstruction. This operational definition of visual information suggests desirable properties that a visual representation should possess to best accomplish vision-based decision and control tasks.

 
© 2017 Alcor