FHF: A Frontal Human Following technology for mobile robots
Robots following humans is an effective and practical function, particularly in the context of service robotics. However, most existing research has focused on robots following the human from behind, with relatively little attention given to robots operating in front of the human. Frontal following, where the robot remains within the user's field of view, is more reassuring and facilitates interaction. New challenges arise when developing a tracker that estimates the user's pose from a 2D LiDAR mounted at knee height, especially since the legs frequently occlude each other. The robot must also guarantee the user's safety while being able to keep up, or catch up, in situations where it falls behind.
The context
Mobile robots are increasingly ubiquitous in various settings, such as shopping centers, hospitals, warehouses, and factories. Many tasks in these applications are shared between robots and human operators through voice, video, force interaction, etc., either because human expertise or agility is required for certain tasks or because the robot can assist the operator. In this project, we investigate the automatic tracking and following of a user by a mobile robot based on 2D LiDAR. Common rear-following algorithms keep the robot at a certain distance behind the user. However, some studies have shown that people prefer to keep robots within their field of view, and may feel uncomfortable and unsafe when robots appear behind them. Moreover, some specific services require robots to move in front of users: assistance robots, for example, act like guide dogs to provide navigation assistance for the visually impaired. Frontal following is therefore progressively becoming popular and valuable.
Objectives
The scientific objectives of this project are as follows:
- Build a human pose (orientation and position) tracker for mobile robots, based on 2D LiDAR at knee height, by studying human gait;
- Collect LiDAR scans from different volunteers, together with ground-truth data on human orientation, to build a data-driven model that improves orientation estimation;
- Solve the problem of leg self-occlusion during scanning by modeling gait or using machine learning techniques;
- Develop a motion generator to enable the robot to move safely and naturally in front of the user.
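As a rough illustration of the first objective, a knee-height 2D scan can be segmented into clusters, the two largest clusters taken as the legs, and a pose derived from their centroids. The sketch below is deliberately naive, it ignores the self-occlusion problem the project actually tackles, and all function names are illustrative, not the project's code:

```python
import math

def cluster_scan(points, gap=0.1):
    """Group consecutive 2D scan points into clusters, splitting where
    the gap between neighbors exceeds `gap` meters."""
    clusters, current = [], [points[0]]
    for p, q in zip(points, points[1:]):
        if math.dist(p, q) > gap:
            clusters.append(current)
            current = []
        current.append(q)
    clusters.append(current)
    return clusters

def centroid(pts):
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def estimate_pose(points, gap=0.1):
    """Return (x, y, heading): position is the midpoint of the two largest
    clusters (assumed legs); heading is normal to the leg-to-leg axis."""
    legs = sorted(cluster_scan(points, gap), key=len, reverse=True)[:2]
    (x1, y1), (x2, y2) = centroid(legs[0]), centroid(legs[1])
    heading = math.atan2(x2 - x1, -(y2 - y1))
    return (x1 + x2) / 2, (y1 + y2) / 2, heading
```

In practice, such a geometric baseline breaks down exactly when one leg occludes the other, which is why the project resorts to gait modeling and learned models.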
The results
Twelve volunteers (three females and nine males) were invited to participate in the frontal following experiment, and the human pose tracker based on human gait performed well. The quantitative analyses showed that the mean absolute error (MAE) of position was around 4 cm, and the MAE of orientation was less than 12 degrees for complex walking.
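The reported errors follow the standard mean-absolute-error computation; the only subtlety for orientation is wrapping angular differences so that, e.g., 350° vs 10° counts as a 20° error. A minimal sketch (generic metric code, not the project's implementation):

```python
import math

def position_mae(estimates, ground_truth):
    """Mean absolute Euclidean error between estimated and true 2D positions."""
    return sum(math.dist(p, q) for p, q in zip(estimates, ground_truth)) / len(ground_truth)

def orientation_mae(est_deg, gt_deg):
    """Mean absolute angular error in degrees, with differences wrapped
    to [-180, 180) so errors across the 0/360 boundary are not inflated."""
    def wrap(d):
        return (d + 180.0) % 360.0 - 180.0
    return sum(abs(wrap(a - b)) for a, b in zip(est_deg, gt_deg)) / len(gt_deg)
```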
Data collected over six hours from five volunteers (one female and four males) were used to build data-driven models to improve orientation estimation. After compensating for estimation delay, the customized model achieved an MAE between 4 and 7 degrees for all five volunteers.
The frontal following motion generator enables the robot to move naturally in front of the user, while always maintaining a safe distance of one meter during the experiments. For more details, see the video.
Publications
- 2D LiDAR-Based Human Pose Tracking for a Mobile Robot, ICINCO 2023;
- Human Orientation Estimation from 2D Point Clouds Using Deep Neural Networks in Robotic Following Tasks, submitted to IROS 2024;
- Large Workspace Frontal Human Following for Mobile Robots Utilizing 2D LiDAR, submitted to JIRS.
TELIN – The Laughing Infant
The TELIN project focuses on developing a socially interactive robotic agent (SIA) capable of replicating the behavior of an infant during laughter acquisition. Its main challenges lie in modeling the robot's laughter and in deciding, in real time, when to laugh, taking into account the cognitive state of the infant, thus going beyond the cognitive capacities currently assumed for such agents.
To address these challenges, TELIN compiles a vast corpus of recordings of infants laughing in various contexts and develops methods for manual and automatic annotation. The project then analyzes the production of laughter in infants to create a formal model. Based on this model, TELIN develops and evaluates a computational model that enables the robot to decide and generate laughter in real-time during interactions. This initiative requires interdisciplinary collaboration between formal linguistics, artificial intelligence, and audio signal processing.
The context
Laughter, one of the earliest forms of communication in infants, emerges as early as three months of age, well before language, gestures, or walking. Recent studies have highlighted the close link between the acquisition of laughter and advanced cognitive skills, particularly related to understanding negation, offering an intriguing perspective on the evolution of human communication.
The TELIN project draws on a synthesis of diverse research areas including language acquisition, semantics and pragmatics of laughter, Socially Interactive Agents (SIAs), as well as laughter analysis and synthesis, combined with advancements in machine learning. Its aim is to develop an SIA capable of mimicking an infant during laughter acquisition, and to use this SIA to evaluate multiple learning algorithms. These algorithms take into account different input modalities such as audio, facial expression, and language, as well as various contexts such as playing with toys and family interactions, to generate laughter responses.
The project is supported by the Mission for Transversal and Interdisciplinary Initiatives (MITI) of CNRS, which funds interdisciplinary research projects by financing doctoral allocations for a period of three years, coupled with research budgets for the first two years.
Objectives
The thesis topic of the TELIN project is to develop formal and computational models that determine when and how a robotic infant (a Furhat robot with a baby mask) responds to a human participant's expressions and activity. The focus is on infant laughter production. This entails:
- Analysis of a corpus of infant laughter,
- Development of a rigorous theoretical analysis of laughter during parent-infant interaction,
- Development of a computational model based on deep learning approaches that simulates when laughter should be triggered.
The models will be objectively evaluated as well as through experimental studies.
The results
The integration of language and non-verbal communication is a crucial goal for AI. TELIN advances this field by studying it in a simpler yet ecologically valid setting with respect to natural language understanding, interaction, and world knowledge.
Modeling when laughter should occur in an interaction between humans and AI is still in its infancy. Research within the TELIN framework will address this question and contribute to refocusing efforts in this direction. Additionally, the development of a computer model of a laughing virtual agent (integrated into the Greta platform) will benefit the AI community by providing a new sequence-to-sequence architecture.
Finally, TELIN will provide a platform for a more ecologically valid study of communication development, given the emphasis on multimodal interaction. It will provide detailed empirical and formal reports on the emergence of laughter, a relatively unexplored domain. The SIA platform stemming from TELIN will be available for conducting human-agent studies.
Partnerships and collaborations
The project is led by Université Paris-Cité and also involves:
- ISIR at Sorbonne University,
- and the Sciences and Technologies of Music and Sound (STMS) laboratory at Sorbonne University.
Projet GUIDANCE – “General pUrpose dIalogue-assisted Digital iNformation aCcEss”
The GUIDANCE project aims to federate the French Information Retrieval (IR) community by bringing together experts in the field to advance the development of Dialogue-based Information Access (DbIA) models leveraging LLMs.
The aim of the project is to develop new models and resources for interactive information access, e.g. dialoguing with a computer system in order to access (possibly automatically generated) information, while ensuring, on the one hand, adaptation to domains and languages with few resources (compared with English) and, on the other hand, the explainability and veracity of the generated information.
The context
The GUIDANCE project takes place in the context of large language models (LLMs) and conversational systems (e.g. ChatGPT, WebGPT), which have experienced tremendous practical progress in the last few months. The project GUIDANCE aims to conduct research on General Purpose Dialogue-assisted Digital Information Access, specifically how to enable users to access digital information, with the goal of overcoming several limitations of current LLMs:
- LLMs were not designed with information access in mind, whether in their pre-training tasks or their fine-tuning ones;
- LLMs have limited generalization abilities to new domains and/or languages;
- the veracity and truthfulness of their output are questionable;
- the most capable state-of-the-art LLMs are not open access, and their scientific methodology and evaluation are barely described in the scientific literature.
Objectives
From a research perspective, GUIDANCE addresses four challenges associated with this project:
- How to design new LLMs or re-use LLMs for DbIA;
- How to leverage Retrieval-Enhanced Machine Learning (REML) techniques to improve the accuracy and efficiency of information retrieval systems;
- How to adapt LLMs and develop new architectures (for DbIA models) to deal with low-resource settings and domain adaptation, with special attention paid to low/medium-resource languages (e.g. Occitan, French);
- How to design DbIA models that can ensure the veracity and explainability of retrieved and synthesized information, while preserving the user's subjectivity.
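The retrieval-enhanced direction in the challenges above can be illustrated by the basic retrieve-then-prompt loop underlying most DbIA systems: fetch evidence, then ground the model's answer in it. A minimal sketch with a toy lexical retriever (all names and the prompt format are illustrative assumptions, not GUIDANCE code):

```python
def retrieve(query, corpus, k=2):
    """Toy lexical retriever: rank passages by word overlap with the query.
    A real system would use a learned dense or sparse retriever."""
    q = set(query.lower().split())
    return sorted(corpus, key=lambda p: len(q & set(p.lower().split())),
                  reverse=True)[:k]

def build_prompt(query, passages):
    """Assemble a grounded prompt so the model answers from retrieved
    evidence and can cite it, one lever for veracity and explainability."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (f"Answer using only the passages below; cite them as [n].\n"
            f"{context}\nQuestion: {query}\nAnswer:")
```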
The results
The GUIDANCE project is expected to deliver multiple results, paving the way for significant advances in the field of information access.
Firstly, the development of resources for training information access models, made available to the community: learning corpora that can be used to train new, more powerful models.
Secondly, the project aims to develop new modes of interaction with information access systems: a search engine can be pro-active in guiding the user towards relevant results (much more than by proposing related questions, as is currently the case).
Finally, the provision of pre-trained models for information access, which will enable these interactive models to be used freely, whether for research or other purposes.
Partnerships and collaborations
Led by Benjamin Piwowarski, CNRS research fellow at ISIR (MLIA team), the GUIDANCE project also involves:
- the Institute of Research in Computer Science of Toulouse (IRIT) through the two IRIS and SIG research teams,
- the Grenoble Computer Science Laboratory (LIG) through the APTIKAL and MRIM research teams,
- and the Computer Science and Systems Laboratory (LIS) through the R2I research team.
The project, which began in October 2023, brings together 18 researchers from 6 research groups in Information Retrieval (IR) and Natural Language Processing (NLP).
NeuroHCI project – Multi-scale decision making with interactive systems
This multidisciplinary project uses computational neuroscience to develop HCI models of user behavior. The aim is to study the extent to which theories, models, and methods from computational neuroscience can be transposed to HCI.
The NeuroHCI project aims to improve human decision-making in the physical and digital worlds in interactive contexts. The situations in which a human makes a decision with an interactive system are varied:
Do I use my experience or Google Maps to choose my route? Do I reply to this e-mail on my smartphone or PC? Do I use menus or shortcuts to select this frequent command? Do I use the Da Vinci surgical robot to operate on my patient, or traditional laparoscopic instruments? How can I reach this object with my robotic prosthesis?
The decision may concern a complex computer-aided real-world choice (e.g., a medical treatment) or the choice of a method for carrying out a digital task (e.g., editing a photo with the preferred tool).
The context
Neuroscience studies phenomena involving both decision-making and learning in humans, but these phenomena have received little attention in HCI.
The NeuroHCI project is a human-computer interaction (HCI) project that aims to design interactive systems that develop user expertise by establishing a human-machine partnership. Interaction with these systems can be seen as a multi-scale decision-making problem:
- A task, e.g. choosing the right medical treatment on the basis of AI-based recommendations;
- A method, e.g. choosing between different devices or modalities to perform a task;
- An object, e.g. which physical or virtual object users will interact with;
- A movement, e.g. which trajectory to take to reach the target object.
Objectives
The scientific objective is to understand how users make decisions with interactive systems, and how these decisions evolve over time. Indeed, users gradually develop expertise over the course of repeated use of interactive systems. This expertise influences the way users make decisions. This requires the simultaneous study of the learning and decision-making phenomena underlying the use of interactive systems.
The application goal is to design and implement better interactive and adaptive systems. Human beings adapt and develop their expertise by using an interactive system. The aim here is for the system to evolve in turn to adapt to its users, i.e. to become attuned to user behavior and in particular to their expertise. The goal is a human-machine partnership in which both actors (human and machine) adapt to each other.
The results
To achieve these objectives, we demonstrate the benefits of our approach through 3 applications, for which platforms already exist and are maintained by the partners, but where scientific challenges remain for their adoption in the real world. These three applications are:
- intelligent graphical interfaces such as AI-based recommendation systems;
- immersive simulation systems offering rich haptic feedback;
- and medical cobotic interfaces, which aim to restore or enhance the ability of humans to interact with objects in the real world.
Our research hypothesis is that it is necessary to develop robust computational models of learning and decision-making in HCI. Computational models make it possible to explain and predict human behavior by synthesizing complex phenomena in a testable and refutable way. In HCI, they can be used to assess the quality of an interface without the need for costly and time-consuming user studies. When these models are robust, they can be integrated into interactive systems to optimize interaction and adapt the interface to users' expertise and/or actions.
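The kind of computational model described above can be as simple as a softmax decision rule over utilities that are themselves updated with practice. The sketch below is purely illustrative (the rule, parameters, and update are textbook components, not the NeuroHCI models):

```python
import math

def softmax_choice_probs(utilities, beta=1.0):
    """Probability of choosing each method (e.g. menu vs. shortcut) under a
    softmax decision rule; beta controls determinism (high beta: nearly
    always pick the highest-utility method)."""
    m = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(beta * (u - m)) for u in utilities]
    z = sum(exps)
    return [e / z for e in exps]

def update_utility(u, reward, alpha=0.1):
    """Delta-rule update: the utility of a method drifts toward the reward
    observed when using it, modeling how expertise builds with repeated use."""
    return u + alpha * (reward - u)
```

Models of this shape are testable against choice-frequency data, which is what makes them usable for evaluating interfaces without a full user study.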
Partnerships and collaborations
Led by Gilles Bailly, CNRS Research Director at ISIR, the ANR NeuroHCI project is a multi-team project at ISIR, involving several members of the laboratory.
TraLaLaM Project – Translating with Large Language Models
The TraLaLaM project aims to explore the use of large language models (LLMs) for machine translation, by asking two main questions:
- in what scenarios can contextual information be used effectively via prompts?
- for low-resource scenarios (with a focus on dialects and regional languages), can LLMs be trained effectively without any parallel data?
Accepted under the ANR 2023 call on very large language models (LLM), the project is positioned at the crossroads of artificial intelligence, linguistics and machine translation.
The context
Trained on multimodal gigacorpora, large language models (LLMs) can be used for a variety of purposes. One possible purpose is machine translation, a task for which the LLM-based approach provides a simple answer to two difficult points:
- support for an extended and enriched context (including translation examples or bilingual terminological entries);
- handling translation domains and directions for which learning data is sparse or even non-existent.
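The "extended and enriched context" mentioned above typically enters an LLM through its prompt. A minimal sketch of such prompt construction (the function and its format are illustrative assumptions, not TraLaLaM code):

```python
def mt_prompt(source, target_lang, examples=(), glossary=()):
    """Build a translation prompt carrying optional in-context translation
    examples and bilingual terminology entries, both supplied by the caller."""
    parts = [f"Translate into {target_lang}."]
    if glossary:
        parts.append("Use this terminology:")
        parts += [f"- {src} -> {tgt}" for src, tgt in glossary]
    for src, tgt in examples:
        parts.append(f"Source: {src}\nTranslation: {tgt}")
    parts.append(f"Source: {source}\nTranslation:")
    return "\n".join(parts)
```

Whether such prompting is effective, and in which scenarios, is precisely one of the project's two research questions.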
The objectives
The main objective of the project is to analyze in depth the relevance of LLMs for machine translation.
On the one hand, we will focus on industrial use cases by studying scenarios involving domain adaptation, terminology data, or translation memories, which correspond to realistic situations. On the other hand, we will focus on building a machine translation system from and into all the languages of France, based on a massively monolingual LLM trained with little (if any) parallel data.
Significant scientific challenges lie ahead, such as extending pre-trained models to new, poorly endowed languages, or handling highly idiomatic texts featuring numerous instances of code switching between a minority language and French.
The results
From an industrial point of view, TraLaLaM aims to evaluate the computational cost/performance trade-offs induced by the use of LLMs in machine translation. These new architectures could profoundly transform the way translation systems are trained and deployed in production. Current solutions, however, are either too costly to deploy or underperform dedicated machine translation engines optimized for this task.
With regard to the languages of France, in partnership with various players representing the linguistic communities concerned, we aim to develop operational solutions for certain well-targeted applications, such as the translation of Wikipedia pages, administrative or regulatory texts, etc.
Partnerships and collaborations
Led by Systran, the TraLaLaM project also involves:
- ISIR at Sorbonne University,
- and the ALMAnaCH project-team at the Inria center in Paris.
CAVAA project – Counterfactual Assessment and Valuation for Awareness Architecture
The CAVAA project proposes that awareness serves survival in a world governed by hidden states: it allows agents to deal with the "invisible", from unexplored environments to social interactions that depend on the internal states of other agents and on moral norms. Awareness reflects a virtual world, a hybrid of perceptual evidence, memory states, and inferred "unobservables", extended in space and time.
The CAVAA project will realize a theory of awareness, instantiated as an integrated computational architecture and its components, to explain awareness in biological systems and engineer it in technological ones. It will realize the underlying computational components of perception, memory, virtualization, simulation, and integration; embody the architecture in robots and artificial agents; and validate it across a range of use cases involving the interaction between multiple humans and artificial agents, using accepted measures and behavioural correlates of awareness. Use cases will address robot foraging, social robotics, computer game benchmarks, and human-generated decision trees in a health coach. These benchmarks will focus on resolving trade-offs, e.g. between search efficiency and robustness, and on assessing human users' acceptance of aware technology.
CAVAA's awareness engineering is accompanied by an ethics framework towards human users and aware artefacts, within the broader spectrum of trustworthy AI, considering shared ontologies, intention complementarity, behavioural matching, empathy, relevance of outcomes, reciprocity, and counterfactuals and projections towards new future scenarios to predict the impact of choices. CAVAA will deliver a better user experience because of its explainability, adaptability, and legibility. CAVAA's integrated framework redefines how we look at the relationship between humans, other species, and smart technologies because it makes the invisible visible.
The context
Following Thomas Nagel, conscious agents have an awareness of what it is like to be that agent. This first-person definition precludes a third-person scientific pursuit, leading to the so-called hard problem or explanatory gap, with its pragmatic solution of distinguishing phenomenal from access consciousness. Conscious awareness can further be characterized in terms of its level, from coma to alertness, and its content or quale, pertaining to the distinction between the outside world and the self and to the level of abstraction. In the face of these definitional challenges, theories of consciousness have emphasized different non-exclusive aspects such as grounding in the self and sensorimotor contingencies, complexity, information access, prediction, attention, or meta-representations. However, none of these theories offers a hypothesis about what the function of consciousness could be, let alone its role in future technology and artificial intelligence systems. Rather, the suggestions range from panpsychism to epiphenomenalism, or the realization of specific cognitive functions. Not surprisingly, the realization of aware machines is considered implausible.
The CAVAA project dissents from this position. CAVAA proposes that awareness has a specific function in the control of adaptive behavior which emerged during the Cambrian explosion: the ability to survive in a world governed by hidden states, especially those pertaining to other agents. Indeed, CAVAA proposes that awareness allows agents to decouple from immediate sensory states and deal with the "invisible", ranging from behavioral policies in unexplored environments and unobservable aspects of tasks, to the complexities of social interaction that depend on the internal states of agents (e.g. intentions, knowledge, and emotions), and the moral norms that guide collective interaction. Awareness thus reflects a virtual world which is a hybrid of perceptual evidence, memory states, and inferred "unobservables", extended in space and time, and which rests on five core processes: the ability to virtualize task spaces, to merge "real" and virtual elements into these internal models, to run parallel future-oriented simulations of possible world-self states, to collapse these into a single conscious scene which defines the content of awareness, and to use awareness to bias valuation and memory consolidation.
In realizing this goal, the CAVAA consortium will build on our growing understanding of the biological basis of awareness and its role in constructing the virtualized world models in which mental life occurs.
The objectives
The project will focus on setting up the data management plan, organizing the data collection workflows, establishing the ethical and legal framework, defining the requirements and specifications of the technology, performing a patent and market analysis, and laying the foundation for the CAVAA architecture and its components, validation use cases and metrics, and ethics and legal specifications. The main objectives of the project are:
- elaborate on this activity and ramp up the interfacing of the architecture to external systems and its overall integration;
- deploy the use cases and validate the first results, subject to further analysis and requirements updates; these results will also identify possible limitations in the architecture and its components;
- update the architecture to make it a turn-key solution and advance the most challenging benchmark tasks; the architecture and its underlying code will be packaged and documented for public release, pending the objectives of the Small and Medium Enterprises for further commercialisation.
The results
The CAVAA project will develop a cognitive control architecture beyond the current state of the art for advanced synthetic systems, validated in spatial and social interaction tasks. The realisation of the CAVAA architecture advances the state of the art in artificial intelligence, human-robot interaction, and neuroscience-based computing by delivering an integrated architecture and impacting several research domains: from theoretical and computational neuroscience and cognitive science to engineering, philosophy, and social sciences. CAVAA's five-level approach includes cognitive architecture, computing systems, machine awareness, embodiment, and conscious behaviour.
Partnerships and collaborations
The project is funded by the European Innovation Council (EIC) of the European Union, under the reference: EIC 101071178. It is a European collaborative project including the following partners:
- Radboud University, The Netherlands
- Centre for Research & Technology, Hellas (CERTH), Greece
- University of Technology Chemnitz, Germany
- Sorbonne University, France
- Eodyne, Spain
- Robotnik, Spain
- Uppsala University, Sweden
- Tp21, Germany
- University of Oxford, United Kingdom
- University of Sheffield, United Kingdom
ANITA project – Adaptive iNterfaces for accessible and Inclusive digiTal services for older Adults
The ANITA project addresses the issue of accessibility and e-inclusion for older adults. Using a multidisciplinary and integrative approach, we aim to contribute to a better understanding of older adults' needs and preferences regarding access to digital services through virtual assistants. Qualitative and experimental methods (combining clinical assessments, UX (user experience) approaches, machine learning, social signal processing techniques, and social science methods) will allow a better comprehension of older adults' needs so that more inclusive, effective, useful, and acceptable virtual assistant interfaces can be designed.
At its completion, the project will propose concrete ways to bridge the digital divide for older adults by identifying key points for an accessible and acceptable interface design. Such findings will be summarized into a set of practical guidelines for stakeholders willing to engage in an inclusive design approach or to implement and use these solutions. ANITA will also provide a technological and conceptual basis for other publics with specific needs that could also benefit from the dynamic adaptability of virtual assistants for different tasks.
The context
Digital technologies have become indispensable in our daily lives, but they can pose challenges for older people who are not always familiar with these tools. Digital technology is often a source of exclusion for older people, who can feel cut off from social and cultural life. That is why the ANITA project has been set up: to help older people get to grips with digital tools and use them independently.
The ANITA project – Adaptive Interfaces for accessible and Inclusive digiTal services for older Adults – is an ANR project aimed at improving digital accessibility and inclusion for older people.
The objectives
The main objective of ANITA is the design, development and assessment of an adaptive virtual assistant platform, for providing access to digital services, that is able to accommodate different needs and capacities of older adults. Two key features of the project are:
- the focus on automatic and dynamic recognition of user’s behavior (verbal and non-verbal), which will serve as input to the system for providing customized adjustments of accessibility parameters and interaction modalities,
- and the design of interface behaviors for the virtual assistants that promote accessibility, effective and positive interactions when using the system.
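The adaptation loop described in the first key feature, recognized user behavior driving adjustments of accessibility parameters, can be caricatured with simple rules. The sketch below is purely illustrative: the signal names and thresholds are invented for the example, and in ANITA the mapping would be learned from data and validated with users:

```python
def adapt_interface(signals, settings):
    """Map recognized user signals (verbal and non-verbal) to accessibility
    adjustments. The rules below are illustrative placeholders only."""
    s = dict(settings)  # never mutate the caller's settings
    if signals.get("speech_rate_wpm", 150) < 100:
        s["tts_rate"] = "slow"           # mirror a slower speaking pace
    if signals.get("mis_taps", 0) >= 3:
        s["button_scale"] = 1.5          # enlarge touch targets
    if signals.get("leaning_in", False):
        s["font_scale"] = s.get("font_scale", 1.0) * 1.25
    return s
```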
The results
The main application is the creation of interfaces to facilitate access to services.
ANITA will provide a comprehensive approach to the conception and use of virtual assistants for older adults, addressing themes such as the digital divide, social inclusion, representations as well as advantages and drawbacks related to the use of virtual assistants by older adults themselves and by technology designers. We will also examine ethical risks (e.g., deception, attachment, replacement of human assistance, vulnerability, stigmatization) and legal concerns (e.g. consent, privacy, security) regarding technologies using AI, user data collection, and user profiling for the effective operation of the services.
Partnerships and collaborations
Coordinated by Broca Hospital (APHP), the project involves several European partners, namely:
- the Grenoble Computer Science Laboratory (LIG),
- the Public Assistance Hospitals of Paris (APHP),
- ISIR at Sorbonne University,
- and Spoon.
MaTOS project – Machine Translation for Open Science
The MaTOS (Machine Translation for Open Science) project aims at developing new methods for machine translation (MT) of documents, by addressing both terminological modeling problems and the processing of discourse and its organization within an automatic text generation framework. It also includes a component on the study of evaluation methods and a large-scale experiment on the HAL archive.
The context
Scientific English is the lingua franca used in many scientific fields to publish and communicate research results. However, in order for these results to be accessible to students, science journalists or decision-makers, translation must take place. The language barrier, therefore, appears to be an obstacle that limits or slows down the dissemination of scientific knowledge. Can machine translation help to overcome these challenges?
The MaTOS project – Machine Translation for Open Science – is an ANR project that aims to improve the circulation and diffusion of scientific knowledge through improved machine translation. It aims to create an experimental machine translation tool around science, a field where this technology sometimes faces difficulties.
The objectives
MaTOS aims at developing new methods for full machine translation (MT) of scientific documents, as well as automatic metrics to evaluate the quality of the produced translations. Our main application target is the translation of scientific articles between French and English, where linguistic resources can be exploited to obtain more reliable translations, both for publication support and for reading and text mining purposes. However, efforts to improve machine translation of complete documents are hampered by the inability of existing automatic metrics to detect weaknesses in the systems and to identify the best ways to remedy them. The MaTOS project proposes to address both of these challenges head-on.
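One example of a document-level property that sentence-level metrics miss is terminological consistency: whether a term is translated the same way throughout an article. A minimal sketch of such a check (illustrative only, not a MaTOS metric):

```python
from collections import Counter

def term_consistency(sentences, term_variants):
    """For each source term, the fraction of its occurrences that use the
    majority translation variant across the whole document, a signal
    invisible to metrics that score each sentence in isolation."""
    scores = {}
    for term, variants in term_variants.items():
        counts = Counter()
        for sent in sentences:
            for v in variants:
                counts[v] += sent.lower().count(v.lower())
        total = sum(counts.values())
        scores[term] = max(counts.values()) / total if total else 1.0
    return scores
```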
The results
This project is part of a movement to automate the processing of scientific articles. The field of machine translation is no exception to this trend, especially in the biomedical field. The applications are numerous: text mining, bibliometric analysis, automatic detection of plagiarism and of articles reporting falsified conclusions, etc. We wish to take advantage of the results of this work, but also to contribute to it in several ways:
- by developing new open resources for specialized machine translation;
- by improving the description of textual coherence markers for scientific articles through the study of terminological variations;
- by studying new multilingual processing methods for these documents;
- by proposing metrics dedicated to measuring progress in this type of task.
Ultimately, improved translation will help streamline the circulation and dissemination of scientific knowledge.
Partnerships and collaborations
Coordinated by François Yvon, researcher at ISIR (MLIA team) of Sorbonne University, the MaTOS project brings together three other partners:
- CLILLAC (Center for Inter-Language Linguistics, Lexicology, English Linguistics and Corpus Workshop),
- Inria,
- and Inist (Institute for Scientific and Technical Information).
MARSurg project – Markerless Augmented Reality for the Future Orthopedic Surgery
In the MARSurg project, we target joint replacement procedures. The solution aims to be generic and easily adaptable to other orthopedic surgery disciplines and beyond. Focused on efficiency, the MARSurg demonstrator will address the optimal placement of prostheses in knee surgery, with regular transposition and verification tests on other orthopedic surgeries (such as shoulder or hip).
The context
With the aging of the population, the number of surgeries to replace failing joints (hip, knee, shoulder, etc.) is growing rapidly. This represents more than one third of the implantable medical device market.
In orthopedic surgery, the 3D positioning of failing joints and of the artificial prostheses that replace them is an important criterion for successful surgery. This geometric and kinematic information is usually obtained with a set of specific, often invasive, metallic instruments. The estimation of the spatial position of prostheses has made significant progress with the development of medical imaging, computer-assisted navigation and robotics. However, even though these methods bring real clinical added value for the patient (better functioning of the prosthesis, better acceptability, improved life span, etc.), they have several limitations: complexity of use, high cost, and accuracy that does not fully meet clinical requirements.
This is the context of the ANR PRCE MARSurg project – Markerless Augmented Reality for the Future Orthopedic Surgery, which aims to develop an innovative surgical navigation solution with high scientific, technological and clinical potential. The platform will rely on Augmented Reality (AR) together with computer vision and Artificial Intelligence (machine learning) methods to estimate the geometric and kinematic parameters of the joints and relay them, in real time, to the surgeon during the operation.
The objectives
In this context, several technological, scientific and clinical objectives are targeted in the framework of MARSurg. These objectives include:
- Implement a new surgical protocol for total knee joint replacement that guarantees reduced invasiveness while improving the functionality of the replacement prosthesis (stability, life span, etc.);
- Develop a new pre-industrial system including an augmented reality software platform that will intuitively provide all the information the surgeon needs during the surgical procedure;
- Improve state-of-the-art methods for markerless 3D pose estimation using geometric and artificial-intelligence methods;
- Advance state-of-the-art methods on the segmentation and registration of 3D images from so-called RGB-D cameras (a camera that simultaneously provides a color image and a depth map characterizing the distance of the objects seen in the image), particularly in the context of clinical applications;
- Accelerate the industrial transfer of the methods developed to make Pixee Medical a world leader in orthopedic surgery.
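To make the RGB-D notion from the objectives above concrete, the sketch below back-projects a depth map into an organised 3D point cloud with the standard pinhole camera model; the intrinsic parameters (fx, fy, cx, cy) are illustrative placeholders, and this is a generic textbook step, not the project's actual pipeline. Registration methods then align such point clouds with a pre-operative bone model.

```python
import numpy as np

def deproject(depth, fx, fy, cx, cy):
    """Back-project a depth map (in metres) into an (h, w, 3) point
    cloud using the pinhole model: X = (u - cx) * Z / fx, etc.
    Intrinsics are hypothetical; real values come from calibration."""
    h, w = depth.shape
    # Pixel coordinate grids: u varies along columns, v along rows
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)
```

Once both the intra-operative cloud and the model are in this metric form, a rigid registration (e.g. an ICP-style alignment) yields the 6-DoF pose that a marker-based tracker would otherwise provide.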
The results
In summary, the MARSurg project aims to develop a generic software platform for orthopedic surgery (beyond knee surgery), targeting the replacement of failing joints with artificial prostheses. This will be achieved by drawing on several scientific disciplines: visual perception with depth cameras, computer vision, artificial intelligence, software engineering and augmented reality. A final demonstrator of the augmented reality platform will be tested and evaluated in conditions close to those of an operating room, in the presence of specialist surgeons.
Partnerships and collaborations
The 4-year project is coordinated by Brahim Tamadazte, CNRS Research Fellow and member of ISIR, Sorbonne University. The project consortium is also composed of:
- Inria Rennes Bretagne-Atlantique, represented by Eric Marchand, University Professor at Rennes 1,
- and Pixee Medical, a French company specialized in the development of innovative solutions for knee surgery, represented by Anthony Agustinos, R&D manager.
VirtuAlz project – “Virtual Patient” simulation training tool for professionals in the health and medico-social sectors working with people suffering from Alzheimer’s disease or related disorders.
How can caregivers, especially nurses and care assistants, overcome the difficulties of interacting with Alzheimer’s patients on a daily basis? How can they be trained in the right gestures and appropriate communication at each stage of the disease? How can professional practice be improved through simulation? The VirtuAlz project was born in response to this training need expressed by health professionals.
The VirtuAlz serious game is designed to provide geriatric professionals with a training module in clinical reasoning and non-verbal communication skills by proposing various scenarios of frequently encountered critical and hard-to-manage situations (e.g. refusal to take medication, wandering), in a secure setting.
The context
The communication skills of people with Alzheimer’s disease or a related disorder (ADRD) deteriorate over time, as memory loss and/or impairment of certain cognitive functions worsen. These disorders impair the quality of their relationships with their carers and caregivers, yet this dimension is rarely taken into account in the initial and ongoing training of healthcare professionals.
Insufficient preparation, particularly in non-verbal communication to which these patients are very sensitive, is the cause of many difficulties encountered by carers today. Training for health professionals in contact with and in charge of people with cognitive disorders should include not only theoretical but also practical knowledge to communicate effectively with these patients and to manage complex situations while respecting their safety and dignity.
These skills, which are essential for health professionals and staff in charge of people with Alzheimer’s disease, are currently insufficiently mastered and rarely taught, either in initial or in-service training.
The objectives
The objective of the VirtuAlz project was to design, develop and evaluate a virtual patient (VP) that could simulate Alzheimer’s disease symptoms, both verbally and non-verbally (facial expression, posture, movements). The project was based on the analysis of activity in work situations for training, the scripting of relational digital simulation with a virtual patient and the automated, real-time interpretation of the learner’s behaviour based on social signals (movements, gestures, facial expressions, interpersonal distance).
The results
In the pre-tests, we evaluated the ergonomic qualities of the device and the proposed scenarios using interviews and questionnaires with health professionals. This stage allowed us to make the necessary technical modifications to the device.
We then conducted two waves of experiments covering two scenarios (“taking medication” and “wandering”) with healthcare professionals, and showed that the VirtuAlz device let them interact with a usable and acceptable virtual patient offering, according to these professionals, a good level of realism.
We developed the generation of virtual-patient behaviours, both verbal (synthesised voice) and non-verbal (body and head movements, gaze direction, facial expressions), mimicking an elderly patient with signs of Alzheimer’s disease (apathy, memory loss, agitation, aggression or refusal of care). The trainee could interact in natural language with the virtual patient through a Wizard-of-Oz simulation. The platform analyses the video stream and transmits, in real time, a sequence of symbols describing the non-verbal behaviour of the health professional to the other software modules of the VirtuAlz project.
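The idea of turning a video stream into a "sequence of symbols" can be sketched as follows. This is a hypothetical illustration, not the VirtuAlz implementation: the signal names, thresholds and symbol labels are all assumptions; the real platform works from richer visual features.

```python
def symbolise(frames, touch_thresh=0.05, near_thresh=0.8):
    """Turn per-frame scalar measurements into a time-stamped sequence
    of discrete behaviour symbols. Signal names and thresholds are
    hypothetical; each frame is a dict of normalised measurements."""
    symbols = []
    for t, f in enumerate(frames):
        if f["hand_face_dist"] < touch_thresh:    # hand close to the face
            symbols.append((t, "FACE_TOUCH"))
        if f["interpersonal_dist"] < near_thresh:  # learner near the patient
            symbols.append((t, "CLOSE_PROXEMICS"))
        if f["smile_score"] > 0.5:                 # detected facial expression
            symbols.append((t, "SMILE"))
    return symbols
```

Downstream modules can then score a trainee's session by matching such symbol sequences against the expected behaviours for a given scenario.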
A key aspect of the VirtuAlz serious game was the automatic evaluation of the non-verbal behaviours (facial expressions, proxemics, face touching, movements, postures) of learners, captured during interaction with the virtual patient.
Finally, we examined the conditions for deploying the device in professional training. The virtual-patient device created lays the foundations for a library of varied training modules for all types of context.
Partnerships and collaborations
Supported by the French National Research Agency (ANR) as part of the Life, health and wellbeing Challenge, the Virtualz project involved the Assistance Publique-Hôpitaux de Paris (APHP, coordinator), the Institute of Intelligent Systems and Robotics (ISIR), the Computer Science for Mechanics and Engineering Sciences laboratory (LIMSI), the Lille Interuniversity Education Research Centre (CIREL), and the company SimForHealth (Interaction Healthcare). It lasted 48 months (May 2018 to May 2022).