Main Conference

ICIAP2017

Sept. 11
  • Room 5 (36 seats): WS: Automatic Affect Analysis & Synthesis (F/D)
  • Room 6 (40 seats): WS: Background learning for detection and tracking from RGBD Videos (H/D)
  • Room 7 (40 seats): WS: Social Signal Processing and Beyond (F/D)
  • Room 8 (69 seats): Tutorial: Active Vision and Human Robot Collaboration (F/D)
  • Room 9 (71 seats): WS: First International Workshop on Brain-Inspired Computer Vision (F/D)
  • Room 13 (36 seats): Tutorial: Quantitative imaging in monitoring response to treatment: Challenges and opportunities (H/D)

Sept. 12
  • Room 5 (36 seats): WS: Natural human-computer interaction and ecological perception in immersive virtual and augmented reality (H/D)
  • Room 6 (40 seats): WS: Third International Workshop on Multimedia Assisted Dietary Management (F/D)
  • Room 7 (40 seats): WS: International Workshop on Biometrics as-a-service: cloud-based technology, systems and applications (F/D)
  • Room 8 (69 seats): Tutorial: Image Tag Assignment, Refinement and Retrieval (H/D)
  • Room 9 (71 seats): Tutorial: Humans through the eyes of a robot: how human social cognition could shape computer vision (H/D)
  • Room 13 (36 seats): Tutorial: Virtual Cell Imaging (methods and principles) (H/D)

Sept. 13 to Sept. 15: Main Conference


Title Quantitative imaging in monitoring response to treatment: Challenges and opportunities
Speaker Habib Zaidi, Ph.D
Date – Location Sept. 11 – Room 13
Abstract This talk reflects the tremendous increase in interest in molecular and dual-modality imaging (PET/CT, SPECT/CT and PET/MRI) as both clinical and research imaging modalities in the past decade. An overview of molecular multi-modality medical imaging instrumentation, as well as of simulation, reconstruction, quantification and related image processing issues, is presented, with special emphasis on the quantitative analysis of nuclear medical images. This tutorial aims to bring the biomedical image processing community a review of the state-of-the-art algorithms, in use and under development, for accurate quantitative analysis in multimodality and multiparametric molecular imaging, and of their validation, mainly from the developer’s perspective and with emphasis on image reconstruction and analysis techniques. It will inform the audience about a series of advanced developments recently carried out at the PET Instrumentation & Neuroimaging Lab of Geneva University Hospital and by other active research groups. Current and prospective applications of quantitative molecular imaging are also addressed, especially its use prior to therapy for dose distribution modelling and optimisation of treatment volumes in external radiation therapy, and for patient-specific 3D dosimetry in targeted therapy, towards the concept of image-guided radiation therapy.


Title Virtual Cell Imaging (methods and principles)
Speaker David Svoboda
Date – Location Sept. 12 – Room 13
Abstract Interdisciplinary research connecting pure image processing with pure biology and medicine brings many challenging tasks. These tasks are highly practical, and their solutions have a direct impact on, for example, the development of disease treatments or drugs. This talk is aimed at students and researchers who plan to join application-oriented research groups where segmentation or tracking methods for the analysis of fixed or living cells are developed or used. Attendees of this tutorial will not only learn to use the commonly available simulation toolkits, and the benchmark image data these toolkits produce, to verify the accuracy of a given image analysis method. They will also understand the principles of these simulation frameworks and will be able to design and implement their own toolkits tailored to their own data.

Title Image Tag Assignment, Refinement and Retrieval
Speaker Xirong Li, Tiberio Uricchio, Lamberto Ballan, Marco Bertini, Cees Snoek, Alberto Del Bimbo
Date – Location Sept. 12 – Room 8
Abstract In this half-day tutorial we focus on challenges in content-based image retrieval in the context of social image platforms and automatic image annotation, with a unified review of three closely linked problems in the field, i.e., image tag assignment, tag refinement, and tag-based image retrieval. Existing works in tag assignment, refinement, and retrieval vary in terms of their targeted tasks and methodology, making it non-trivial to interpret them within a unified framework. We reckon that all works rely on the key functionality of tag relevance, i.e., estimating the relevance of a specific tag with respect to the visual content of a given image. Given such a tag relevance function, one can perform tag assignment and refinement by sorting tags in light of the function, and retrieve images by sorting them accordingly. Consequently, we present a taxonomy, which structures the rich literature along two dimensions, namely media and learning. The media dimension characterizes what essential information the tag relevance function exploits, while the learning dimension depicts how such information is exploited. With this taxonomy, we discuss the connections and differences between the many methods, and their advantages as well as limitations.

A selected set of eleven representative and highly cited works has been implemented and evaluated on the test bed for tag assignment, refinement, and/or retrieval. To facilitate comparisons between state-of-the-art methods, we present an open-source test bed comprising the source code of these eleven methods and an experimental setup based on four social image datasets and on ImageNet. The test bed can be further expanded, and the proposed experimental setup makes it easy to evaluate new methods. Moreover, we provide a brief live demo session with the methods, software and datasets. For repeatable experiments, all data (e.g. features) and code are available online.
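
As a rough illustration of the unifying idea in the abstract, the sketch below shows how a single tag relevance function can drive all three tasks by sorting. It is not taken from the tutorial's test bed: the function names, the toy score table, and the image identifiers are all hypothetical, and a real relevance function would be estimated from visual content rather than looked up.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical signature: relevance(image_id, tag) -> score (higher = more relevant).
Relevance = Callable[[str, str], float]

# Made-up scores standing in for a learned tag relevance function.
TOY_SCORES: Dict[Tuple[str, str], float] = {
    ("img1", "beach"): 0.9, ("img1", "dog"): 0.2, ("img1", "sunset"): 0.6,
    ("img2", "beach"): 0.1, ("img2", "dog"): 0.8, ("img2", "sunset"): 0.3,
}


def toy_relevance(image: str, tag: str) -> float:
    """Look up a toy score; unknown (image, tag) pairs score 0."""
    return TOY_SCORES.get((image, tag), 0.0)


def assign_tags(image: str, vocabulary: List[str], relevance: Relevance, k: int) -> List[str]:
    """Tag assignment: keep the k vocabulary tags most relevant to the image."""
    ranked = sorted(vocabulary, key=lambda t: relevance(image, t), reverse=True)
    return ranked[:k]


def refine_tags(image: str, user_tags: List[str], relevance: Relevance) -> List[str]:
    """Tag refinement: re-rank the tags a user already attached to the image."""
    return sorted(user_tags, key=lambda t: relevance(image, t), reverse=True)


def retrieve_images(tag: str, images: List[str], relevance: Relevance) -> List[str]:
    """Tag-based retrieval: rank images by their relevance to the query tag."""
    return sorted(images, key=lambda i: relevance(i, tag), reverse=True)


if __name__ == "__main__":
    print(assign_tags("img1", ["beach", "dog", "sunset"], toy_relevance, k=2))
    print(refine_tags("img2", ["sunset", "dog"], toy_relevance))
    print(retrieve_images("beach", ["img2", "img1"], toy_relevance))
```

The three functions differ only in what is being sorted (candidate tags, existing tags, or images), which is the sense in which the tasks share one underlying relevance function.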

Title Active Vision and Human Robot Collaboration
Speaker Dimitri Ognibene, Fiora Pirri, Guido De Croon, Lucas Paletta, Mario Ceresa, Manuela Chessa, Fabio Solari
Date – Location Sept. 11 – Room 8
Abstract Unstructured social environments, e.g. building sites, release an overwhelming amount of information, yet behaviorally relevant variables may not be directly accessible.

Currently proposed solutions for specific tasks, e.g. autonomous cars, usually employ overly redundant, expensive and computationally demanding sensory systems, which attempt to cover the wide set of sensing conditions the system may have to deal with.

Active control of the sensors and of the perception process, Active Perception (AP), is a key solution found by nature to cope with such problems, as shown by the foveal anatomy of the eye and its high mobility and control accuracy. The design principles of systems that adaptively find and select relevant information are important for both Robotics and Cognitive Neuroscience.

At the same time, collaborative robotics has recently progressed to human-robot interaction in real manufacturing. Measuring and modeling task-specific human gaze behaviour is mandatory to support smooth human-robot interaction.

Variables related to human attention processes are essential for the evaluation of human-robot interaction metrics. Moreover, anticipatory control for human-in-the-loop architectures, which enables robots to proactively collaborate with humans, relies heavily on the observed gaze and action patterns of their human partners.

The tutorial will describe several systems employing active vision to support robot behavior and their collaboration with humans.

The systems described employ different strategies:

  1. model-based systems using information-theoretic measures to select perception parameters;
  2. neural and bio-inspired perception controllers trained to support task execution;
  3. imitation-based attention.

Different settings and tasks pose distinct complexities and call for corresponding solutions. The tutorial will present architectural designs and signal processing methods for active vision systems employed in:

  1. Disaster sites exploration
  2. Human robot collaboration in industrial tasks
  3. Smart Surgical room
  4. Light AUV Navigation
  5. Humanoid Companions
  6. Inspection and object recognition


Title Humans through the eyes of a robot: how human social cognition could shape computer vision
Speaker Nicoletta Noceti, Alessandra Sciutti
Date – Location Sept. 12 – Room 9
Abstract The new frontiers of robotics research foresee future scenarios where artificial agents will participate more and more in our daily life activities. While the presence of robotic devices in our homes is nowadays limited to vacuum cleaners, pool cleaners and lawn mowers, it is plausible that we will experience an extraordinary growth in demand for robotics in the consumer sector. According to the EU Strategic Road Map 2014-2020, robotics applications are expected to influence not only domestic activities, but also entertainment, education, monitoring, security and assistive living. This will lead robots to frequent interactions with untrained humans in unstructured environments. The successful integration of robots in our everyday life therefore depends on the acceptance of these novel tools by the population. The level of comfort and safety experienced by users during the interaction plays a fundamental role in this process. Hence, a key challenge in current robotics has become to maximize the naturalness of human-robot interaction (HRI), to foster a pleasant collaboration with potentially non-expert users. One possible approach to this goal is to draw inspiration from human-human interaction. Indeed, humans are able to read subtle signals hidden in others’ movements that reveal their goals and emotional status. This mechanism supports mutual adaptation, synchronization and anticipation, which drastically cut delays and the need for complex verbal instructions in the interaction, and result in seamless and efficient collaboration. In this tutorial we will discuss some guidelines for the design and implementation of effective and natural HRI that stem from the principles governing human-human interaction and its development since birth.
To this aim, we will discuss the strong interconnections between applied robotics and neuro and cognitive science, showing that the development of human perception may be a rich source of inspiration for the design of intelligent robots able to proficiently understand and collaborate with humans. Particular emphasis will be given to motion analysis, discussing tasks addressed in this domain, methodologies, challenges and open questions, while delineating possible research lines for future developments.



Title First International Workshop on Brain-Inspired Computer Vision (WBICV2017)
Organizers George Azzopardi, Laura Fernández-Robles, Antonio Rodríguez-Sánchez
Date – Location Sept. 11 – Room 9
Web page
Description  The visual perception of a human is a complex process performed by various elements of the visual system of the brain. This remarkable unit of the brain has been used as a source of inspiration for developing algorithms that can be used in computer vision tasks such as finding objects, analysing motion, identifying or detecting instances, reconstructing scenes or restoring images. One of the most challenging goals in computer vision is, therefore, to design and develop algorithms that can process visual information as humans do.

The main aim of WBICV2017 is to bring together researchers from the diverse fields of computer science (pattern recognition, machine learning, artificial intelligence, high performance computing and visualisation) along with the fields of visual perception and visual psychophysics who aim to model different phenomena of the visual system of the brain. We look forward to discussing the current and next generation of brain system modelling for a wide range of vision related applications. This workshop aims to comprise powerful, innovative and modern image analysis algorithms and tools inspired by the function and biology of the visual system of the brain.

The researchers will present their latest progress and discuss novel ideas in the field. Besides the technologies used, emphasis will be given to the precise problem definition, the available benchmark databases, and the need for evaluation protocols and procedures in the context of brain-inspired computer vision methods and applications.

Papers are solicited in, but not limited to, the following TOPICS:

  • Mathematical models of visual perception
  • Brain-inspired algorithms
  • Learning: Deep learning, recurrent networks, differentiable neural computers, sparse coding.
  • The appearance of neuronal properties: sparsity and selectivity
  • Circuitry: hierarchical representations and connections between layers.
  • Selecting where to look: saliency, attention and active vision.
  • Hierarchy of visual cortex areas
  • Feedforward, feedback and inhibitory mechanisms
  • Applications: object recognition, object tracking, medical image analysis, contour detection and segmentation


Title Third International Workshop on Multimedia Assisted Dietary Management (MADiMa 2017)
Organizers Stavroula Mougiakakou, Giovanni Maria Farinella, Keiji Yanai
Date – Location Sept. 12 – Room 6
Web page
Description The prevention of onset and progression of diet-related acute and chronic diseases (e.g. diabetes, obesity, cardiovascular diseases and cancer) requires reliable and intuitive dietary management. The need for accurate, automatic, real-time and personalized dietary advice has recently been complemented by advances in computer vision and smartphone technologies, permitting the development of the first mobile food multimedia content analysis applications. The proposed solutions rely on the analysis of multimedia content captured by wearable sensors, smartphone cameras, barcode scanners, RFID readers and IR sensors, along with already established nutritional databases, and often require some user input. In the field of nutritional management, multimedia not only bridges diverse information and communication technologies, but also connects computer science with medicine, nutrition and dietetics. This confluence brings new challenges and opportunities for dietary management.

MADiMa2017 aims to bring together researchers from the diverse fields of engineering, computer science and nutrition who investigate the use of information and communication technologies for better monitoring and management of food intake. The combined use of multimedia, machine learning algorithms, ubiquitous computing and mobile technologies permits the development of applications and systems able to monitor dietary behavior, analyze food intake, identify eating patterns and provide feedback to the user towards healthier nutrition. The researchers will present their latest progress and discuss novel ideas in the field. Besides the technologies used, emphasis will be given to the precise problem definition, the available nutritional databases, the need for benchmarking multimedia databases of packed and unpacked food, and the evaluation protocols.

Topics of interest include (but are not limited to) the following:

  • Ubiquitous and mobile computing for dietary assessment
  • Computer vision for food detection, segmentation and recognition
  • 3D reconstruction for food portion estimation
  • Augmented reality for food portion estimation
  • Wearable sensors for food intake detection
  • Computerized food composition (nutrients, allergens) analysis
  • Multimedia technologies for eating monitoring
  • Smartphone technologies for dietary behavioral patterns
  • Deep Learning for food analysis
  • Food Images and Social Media
  • Food multimedia databases
  • Evaluation protocols of dietary management systems
  • Multimedia assisted self-management of health and disease


Title Social Signal Processing and Beyond (SSPandBE 2017)
Organizers Mariella Dimiccoli, Petia Ivanova Radeva, Marco Cristani
Date – Location Sept. 11 – Room 7
Web page
Description The workshop provides a forum for presenting novel ideas and discussing future directions in the emerging areas of social signal processing in uncontrolled and virtual scenarios. It especially focuses on the interplay between computer vision, pattern recognition, and the social and psychological sciences. We strongly encourage papers covering topics from both the social sciences and computer vision, proposing an original approach that draws on both worlds. Furthermore, we invite contributions on the more ambitious topics of everyday interactions from wearable cameras, groups and crowds, social interactions in a “virtual” setting, and unconventional social signals such as illumination and type of architecture.

Finally, the workshop will also feature an interactive session to explore existing and emerging research problems in the areas of interest for the workshop.

The relevant topics of interest for SSPandBE include but are not limited to:

  • Multi-person/group/crowd interaction analysis
  • Situation awareness and understanding
  • First-person social interactions
  • Socially immersed first person cameras
  • Crowd/group analysis and simulation
  • Social scene and social context understanding
  • Social force models

The major criteria for the selection of papers will be their potential to generate discussion and influence future research directions. Papers have to present original research contributions not concurrently submitted elsewhere. Any paper published by the ACM, IEEE, etc. which can be properly cited constitutes research which must be considered in judging the novelty of an SSPandBE submission, whether the published paper was in a conference, journal, or workshop. Therefore, any paper previously published as part of an SSPandBE workshop must be referenced and suitably extended with new content to qualify as a new submission to the Research Track at the SSPandBE conference.

Paper submission is single blind and will be handled via EasyChair

For any question about the call for papers please contact



Title Natural human-computer interaction and ecological perception in immersive virtual and augmented reality (NIVAR2017)
Organizers Manuela Chessa, Fabio Solari, Jean-Pierre Bresciani
Date – Location Sept. 12 – Room 5
Web page
Description Given the recent spread of technologies, devices, systems and models for immersive virtual reality (VR) and augmented reality (AR), which are now effectively employed in various fields of application, an emerging issue is how interaction occurs in such systems. In particular, a key problem is achieving a natural and ecological interaction with the devices typically used for immersive VR and for AR, i.e. interacting with them by using the same strategies, and eliciting the same perceptual responses, as when interacting in the real world. This is particularly important when VR and AR systems are used in assistive contexts, e.g. targeting elderly or disabled people, or for cognitive and physical rehabilitation, but also to prevent and mitigate visual fatigue and cybersickness when targeting healthy people.
The main scope of this workshop is to bring together researchers and practitioners from both academia and industry who are interested in studying and developing innovative solutions aimed at achieving natural human-computer interaction and ecological perception in VR and AR systems.

Technical topics of interest include (but are not limited to):

  • Natural human-computer interaction in virtual/augmented/mixed reality environments.
  • Ecological validity of virtual/augmented/mixed reality systems and/or human-computer interaction.
  • Hand/face/body recognition and tracking for human-computer interaction.
  • Action and activity recognition for human-computer interaction.
  • Vision neuroscience for human-computer interaction.
  • Eye-tracking for human-computer interaction.
  • Computational vision models.
  • Depth (from stereo and/or other cues) and motion (also self-motion) perception in virtual/augmented/mixed reality environments.
  • Rendering in virtual/augmented/mixed reality environments.
  • Misperception issues and undesired effects in visualization devices (e.g., 3D displays, head-mounted displays).
  • Applications based on displays (also S3D), smartphones, tablets, head-mounted displays.


Title Automatic affect analysis and synthesis
Organizers Nadia Berthouze, Simone Bianco, Giuseppe Boccignone, Paolo Napoletano
Date – Location Sept. 11 – Room 5
Web page
Description Affective computing is a research field that tries to endow machines with capabilities to recognize, interpret and express emotions. On the one hand, the ability to automatically deal with human emotions is crucial in many human-computer interaction applications. On the other hand, people express affect through a complex series of actions relating to facial expression, body movements, gestures and voice prosody, accompanied by a variety of physiological signals such as heart rate and sweat.

Thus, the goals set by affective computing involve a number of challenging issues on how systems should be conceived, built, validated, and compared.

In this perspective, we are soliciting original contributions that address a wide range of theoretical and practical issues including, but not limited to:

  • Facial expression analysis and synthesis;
  • Body gesture and movement recognition;
  • Emotional speech processing;
  • Heart rate monitoring from videos;
  • Emotion analysis from physiological signs;
  • Multimodal affective computing;
  • Affect understanding and synthesis;
  • Computational Visual Aesthetics;
  • Recognition of group emotion;
  • Tools and methods of annotation for provision of emotional corpora;
  • Affective Applications: medical, assistive, virtual reality, entertainment, ambient intelligence, multimodal interfaces.

Authors of selected workshop papers will be invited to submit extended versions to a special issue of a leading international journal.

Title International Workshop on Biometrics as-a-service: cloud-based technology, systems and applications
Organizers Silvio Barra, Arcangelo Castiglione, Kim-Kwang Raymond Choo, Fabio Narducci
Date – Location Sept. 12 – Room 7
Web page
Description Cloud-based biometrics is a relatively new topic, and solutions by emerging companies, e.g., BioID, ImageWare Systems, Animetrics and IriTech, further confirm the expectations of its rapid growth. Biometrics-as-a-service has the same benefits as any other cloud-based service. It is cost-effective, scalable, reliable and hardware agnostic, making enhanced security accessible anytime and anywhere. However, legal and privacy issues vary from country to country, thus limiting the progress of this branch of cloud computing research. We therefore expect that contributions will also shed light on such less explored aspects.

Nowadays, the massive spread of cloud-based systems is leading service providers to offer their users more advanced access protocols, which may overcome the limitations and weaknesses of traditional alphanumeric passwords. Experts all over the world are pushing for cloud-based biometric systems, which are expected to be one of the research frontiers of the coming years. Biometric credentials are difficult to steal and do not need to be remembered, making them suitable for the on-the-move authentication scenarios typical of the current mobile age. On the other hand, the remote storage of a biometric trait on the cloud is prone to function creep, i.e. the gradual widening of the use of a technology or system beyond the purpose for which it was originally intended. Legal and security issues related to the abuse and misuse of a biometric trait obstruct the rapid and widespread diffusion of such practice.

The objective of IW-BAAS is to capture the latest advances in this research field, soliciting papers and ideas about cloud-based biometric systems and services. Contributions on technical, legal, professional and ethical aspects related to the use of biometrics in cloud environments are also encouraged.

Topics of interest include, but are not limited to, the following:

  • Cloud-based Architectures for Biometric Systems;
  • Cloud-based Communication Protocols for Biometric Systems;
  • Biometric Security and Privacy Policy;
  • Ethical, legal, cultural and regulatory factors;
  • Biometric Storage in the Cloud;
  • Biometric Access Control of Cloud Data;
  • Mobile Biometrics and Cloud Computing;
  • Liveness/Spoofing Detection for Cloud Applications;
  • Biometric Cryptography;
  • Biometric Encryption in Cloud computing;
  • Biometric Fusion in the Cloud;
  • Smart spaces and Ambient Intelligence Environments;
  • Biometric representation suitable for the Cloud.

Special Issues on IEEE Cloud Computing will be devoted to the conference topics and the best selected papers will be considered for publication, as extended versions.

Please note that:

  • papers must have been presented at the conference;
  • papers should have been carefully revised and extended with at least 30% of new original

Title Background learning for detection and tracking from RGBD Videos
Organizers Massimo Camplani, Lucia Maddalena, Luis Salgado
Date – Location Sept. 11 – Room 6
Web page
Description The advent of low-cost RGB-D sensors such as Microsoft’s Kinect or Asus’s Xtion Pro is completely changing the computer vision world, as they are being successfully used in several applications and research areas. Many of these applications, such as gaming or human-computer interaction systems, rely on efficiently learning a scene background model for detecting and tracking moving objects, to be further processed and analyzed. Depth data are particularly attractive and suitable for applications based on moving object detection, since they are not affected by several problems typical of color-based imagery. However, depth data suffer from other types of problems, such as depth camouflage or noisy depth sensor measurements, which limit the efficiency of depth-only background modeling approaches. The complementary nature of the synchronized color and depth information acquired with RGB-D sensors poses new challenges and design opportunities. New strategies are required that explore the effectiveness of combining depth-based and color-based features, or of incorporating them jointly into well-known moving object detection and tracking frameworks.
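
To make the complementarity of color and depth concrete, here is a minimal per-pixel sketch, not a method from the workshop or the SBM-RGBD Challenge. It assumes a running-average background, a gray-level intensity in place of full color, depth in metres with `None` marking an invalid measurement, and hand-picked thresholds; the class name and all parameter values are hypothetical.

```python
from typing import Optional


class PixelBackground:
    """Running-average background model for one pixel (intensity + depth)."""

    def __init__(self, intensity: float, depth: Optional[float], alpha: float = 0.05):
        self.intensity = intensity  # background gray level
        self.depth = depth          # background depth in metres; None if never observed
        self.alpha = alpha          # learning rate of the running average

    def is_foreground(self, intensity: float, depth: Optional[float],
                      t_int: float = 30.0, t_depth: float = 0.10) -> bool:
        """Flag the pixel if either cue departs from the background model."""
        color_fg = abs(intensity - self.intensity) > t_int
        if depth is None or self.depth is None:
            # Invalid depth measurement: fall back to the color cue alone.
            return color_fg
        depth_fg = abs(depth - self.depth) > t_depth
        # Color catches depth camouflage; depth catches color camouflage.
        return color_fg or depth_fg

    def update(self, intensity: float, depth: Optional[float]) -> None:
        """Blend the new observation into the background model."""
        a = self.alpha
        self.intensity = (1 - a) * self.intensity + a * intensity
        if depth is not None:
            if self.depth is None:
                self.depth = depth
            else:
                self.depth = (1 - a) * self.depth + a * depth


if __name__ == "__main__":
    bg = PixelBackground(intensity=100.0, depth=2.0)
    print(bg.is_foreground(101.0, 1.2))   # similar color, much closer: depth cue fires
    print(bg.is_foreground(180.0, 2.0))   # same depth, different color: color cue fires
    print(bg.is_foreground(101.0, None))  # invalid depth, similar color: background
```

Combining the two cues with a plain OR is the simplest possible choice and tends to inherit the false positives of both channels; weighting each cue by its reliability is one of the design questions the description above raises.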

The aim of the Workshop is to bring together researchers interested in background learning for detection and tracking from RGBD videos, in order to disseminate their most recent research results, advocate and promote the research in this area, discuss rigorously and systematically potential solutions and challenges, promote new collaborations among researchers working in different application areas, share innovative ideas and solutions for exploiting the potential synergies emerging from the integration of different application domains.

The workshop comes with the companion SBM-RGBD Challenge specifically devoted to scene background modeling from RGBD videos, aiming at advancing the development of related algorithms and methods through objective evaluation on a common dataset and common metrics.


Special Sessions

Title Imaging Solutions for Improving the Quality of Life (I-LIFE’17)
Organizers Dan Popescu, Loretta Ichim
Description The session aims to underline the connection between complex image processing and improving the quality of life. This is an important challenge of modern life, which requires interdisciplinary knowledge and addresses problems from many domains: computer science, medicine, biology, psychology, social policy, agriculture, food and nutrition, etc. This special session at the 19th International Conference on Image Analysis and Processing (ICIAP2017) provides a forum for researchers and practitioners to present and discuss advances in the research, development and applications of intelligent systems for complex image processing and interpretation, with the aim of improving the quality of life of persons with disabilities and assisted persons, or of detecting and diagnosing possible diseases in otherwise healthy persons.
The use of innovative techniques and algorithms in applications such as image processing and interpretation for human behavior analysis and medical diagnosis leads to increased life expectancy, wellbeing and independence of people with disabilities, and to the improvement of ambient/active assisted living (AAL) services. For example: image interpretation for earlier detection of chronic depression can help to prevent severe diseases; patient-centric radiation oncology imaging provides more efficient and personalized cancer care; new methods for the visually impaired (transforming visual information into alternative sensory information, or maximizing residual vision through magnification); analysis of eye vasculature and diseases based on image processing software; medical robots controlled by images; and so on. Other factors that influence the quality of life relate to food analysis and pollution prevention. Here, computer vision can exceed human ability in: real-time inspection of food quality (outside the visible spectrum and in long-term continuous operation); food sorting and defect detection based on color, texture, size and shape; chemical analysis through hyperspectral or multispectral imaging; image processing in agriculture (robotics, chemical analysis, detecting pests, etc.). The quality of life can also be affected by: air pollution detection (detection of dust particles from ground and remote images, air density pollutants); waste detection and management based on the interpretation of aerial images. In the case of disasters such as floods, earthquakes, fires and radiation, image interpretation from different sources (ground, air and space) can be successfully used for improving and saving lives (prevention, monitoring and rescue).
Topics include, but are not limited to, the following:

  • Criteria for efficient feature selection depending on application
  • Image processing from multi-sources based on neural networks
  • Medical diagnosis based on complex image processing
  • New approaches for gesture recognition and interpretation
  • Assistive technologies based on image processing
  • Understanding of indoor complexity for persons with disabilities
  • Ambient monitoring based on image processing
  • Image processing for quality inspection in food industry
  • Image processing for the precision and eco agriculture
  • Image processing for flooding prevention and evaluation