2022 |
---|
Journal Papers |
Sam Broida, Mariah Schrum, Eric Yoon, Aidan Sweeney, Neil Dhruv, Matthew Gombolay, and Sangwook Yoon Improving Surgical Triage in Spine Clinic: Predicting Likelihood of Surgery Using Machine Learning | Abstract Background Correctly triaging patients to a surgeon or non-operative provider is an important part of the referral process. Clinics typically triage new patients based on simple intake questions. This is time-consuming and does not incorporate objective data. Our goal was to use machine learning to more accurately screen surgical candidates seen in spine clinic. Methods Using questionnaire data and MRI reports, a set of artificial neural networks were trained to predict whether a patient would be recommended for spine surgery. Questionnaire responses included demographics, chief complaint, and pain characteristics. The primary endpoint was the surgeon’s determination of whether a patient was an operative candidate. Model accuracy in predicting this endpoint was assessed using a separate subset of patients apart from the training data. Results The retrospective dataset included 1,663 cervical and lumbar patients. Questionnaire data was available for all participants and MRI reads were available for 242 patients. Within six months of initial evaluation, 717 (43.1%) patients were deemed surgical candidates by the surgeon. Our models predict surgeons’ recommendations with AUC scores of 0.686 for lumbar (PPV 66%, NPV 80%) and 0.821 for cervical (PPV 83%, NPV 85%) patients. Conclusions Our models use patient data to accurately predict whether patients will receive a surgical recommendation. The models’ high NPV demonstrates that this approach can reduce the burden of non-surgical patients in surgery clinic without losing many surgical candidates. This could reduce unnecessary visits for patients and increase the proportion of operative candidates seen by surgeons. In World Neurosurgery. [To Appear] |
Dean D. Molinaro, Inseung Kang, Jonathan Camargo, Matthew Gombolay, and Aaron Young Subject-Independent, Biological Hip Moment Estimation during Multimodal Overground Ambulation using Deep Learning In IEEE Transactions on Biomedical Engineering (TBME). [To Appear] |
Andrew Silva, Nina Moorman, William Silva, Zulfiqar Zaidi, Nakul Gopalan, and Matthew Gombolay LanCon-Learn: Learning with Language to Enable Generalization in Multi-Task Manipulation | Abstract Robots must be capable of learning from previously solved tasks and generalizing that knowledge to quickly perform new tasks to realize the vision of ubiquitous and useful robot assistance in the real world. While multi-task learning research has produced agents capable of performing multiple tasks, these tasks are often encoded as one-hot goals. In contrast, natural language specifications offer an accessible means both for (1) users to describe a set of new tasks to the robot and (2) robots to reason about the similarities and differences among tasks through language-based task embeddings. Until now, multi-task learning with language has been limited to navigation based tasks and has not been applied to continuous manipulation tasks, requiring precision to grasp and move objects. We present LanCon-Learn, a novel attention-based approach to language-conditioned multi-task learning in manipulation domains to enable learning agents to reason about relationships between skills and task objectives through natural language and interaction. We evaluate LanCon-Learn for both reinforcement learning and imitation learning, across multiple virtual robot domains along with a demonstration on a physical robot. LanCon-Learn achieves up to a 200% improvement in zero-shot task success rate and transfers known skills to novel tasks faster than non-language-based baselines, demonstrating the utility of language for goal specification. In. IEEE Robotics and Automation Letters, Volume 7, Issue 2, pages 1635-1642. |
Conference Papers |
Andrew Hundt*, William Agnew*, Vicky Zang, Severin Kacianka, and Matthew Gombolay Robots Enact Malignant Stereotypes | Abstract Stereotypes, bias, and discrimination have been extensively documented in Machine Learning (ML) methods such as Computer Vision (CV) [18, 80], Natural Language Processing (NLP) [6], or both, in the case of large image and caption models such as OpenAI CLIP [14]. In this paper, we evaluate how ML bias manifests in robots that physically and autonomously act within the world. We audit one of several recently published CLIP-powered robotic manipulation methods, presenting it with objects that have pictures of human faces on the surface which vary across race and gender, alongside task descriptions that contain terms associated with common stereotypes. Our experiments definitively show robots acting out toxic stereotypes with respect to gender, race, and scientifically-discredited physiognomy, at scale. Furthermore, the audited methods are less likely to recognize Women and People of Color. Our interdisciplinary sociotechnical analysis synthesizes across fields and applications such as Science Technology and Society (STS), Critical Studies, History, Safety, Robotics, and AI. We find that robots powered by large datasets and Dissolution Models (sometimes called “foundation models”, e.g. CLIP) that contain humans risk physically amplifying malignant stereotypes in general; and that merely correcting disparities will be insufficient for the complexity and scale of the problem. Instead, we recommend that robot learning methods that physically manifest stereotypes or other harmful outcomes be paused, reworked, or even wound down when appropriate, until outcomes can be proven safe, effective, and just. Finally, we discuss comprehensive policy changes and the potential of new interdisciplinary research on topics like Identity Safety Assessment Frameworks and Design Justice to better understand and address these harms. In Proc. ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT). |
Roger Dias, Lauren Kennedy-Metz, Steven Yule, Matthew Gombolay, and Marco Zenati Assessing Team Situational Awareness in the Operating Room via Computer Vision | Abstract Situational awareness (SA) at both individual and team levels, plays a critical role in the operating room (OR). During the pre-incision time-out, the entire OR team comes together to deploy the surgical safety checklist (SSC). Worldwide, the implementation of the SSC has been shown to reduce intraoperative complications and mortality among surgical patients. In this study, we investigated the feasibility of applying computer vision analysis on surgical videos to extract team motion metrics that could differentiate teams with good SA from those with poor SA during the pre-incision time-out. We used a validated observation-based tool to assess SA, and a computer vision software to measure body position and motion patterns in the OR. Our findings showed that it is feasible to extract surgical team motion metrics captured via off-the-shelf OR cameras. Entropy as a measure of the level of team organization was able to distinguish surgical teams with good and poor SA. These findings corroborate existing studies showing that computer vision-based motion metrics have the potential to integrate traditional observation-based performance assessments in the OR. In Proc. Conference on Cognitive and Computational Aspects of Situation Management (CogSIMA). |
Rohan Paleja*, Yaru Niu*, Andrew Silva, Chace Ritchie, Sugju Choi, and Matthew Gombolay Learning Interpretable, High-Performing Policies for Autonomous Driving | Abstract Gradient-based approaches in reinforcement learning have achieved tremendous success in learning policies for autonomous vehicles. While the performance of these approaches warrants real-world adoption, these policies lack interpretability, limiting deployability in the safety-critical and legally-regulated domain of autonomous driving (AD). AD requires interpretable and verifiable control policies that maintain high performance. We propose Interpretable Continuous Control Trees (ICCTs), a tree-based model that can be optimized via modern, gradient-based RL approaches to produce high-performing, interpretable policies. The key to our approach is a procedure for allowing direct optimization in a sparse decision-tree-like representation. We validate ICCTs against baselines across six domains, showing that ICCTs are capable of learning interpretable policy representations that match or outperform baselines by up to 33% in AD scenarios while achieving a 300x-600x reduction in the number of policy parameters against deep learning baselines. Furthermore, we demonstrate the interpretability and utility of our ICCTs through a 14-car physical robot demonstration. In Proc. Robotics: Science and Systems (RSS). |
Nakul Gopalan, Nina Moorman, Manisha Natarajan, and Matthew Gombolay Negative Result for Learning from Demonstration: Challenges for End-Users Teaching Robots with Task And Motion Planning Abstractions | Abstract Learning from demonstration (LfD) seeks to democratize robotics by enabling non-experts to intuitively program robots to perform novel skills through human task demonstration. Yet, LfD is challenging under a task and motion planning setting which requires hierarchical abstractions. Prior work has studied mechanisms for eliciting demonstrations that include hierarchical specifications of task and motion, via keyframes [1] or hierarchical task network specifications [2]. However, such prior works have not examined whether non-roboticist end-users are capable of providing such hierarchical demonstrations without explicit training from a roboticist showing how to teach each task [3]. To address the limitations and assumptions of prior work, we conduct two novel human-subjects experiments to answer (1) what are the necessary conditions to teach users through hierarchy and task abstractions and (2) what instructional information or feedback is required to support users to learn to program robots effectively to solve novel tasks. Our first experiment shows that fewer than half (35.71%) of our subjects provide demonstrations with sub-task abstractions when not primed. Our second experiment demonstrates that users fail to teach the robot correctly when not shown a video demonstration of an expert’s teaching strategy for the exact task that the subject is training. Not even showing the video of an analogue task was sufficient. These experiments reveal the need for fundamentally different approaches in LfD which can allow end-users to teach generalizable long-horizon tasks to robots without the need to be coached by experts at every step. In Proc. Robotics: Science and Systems (RSS). |
Zheyuan Wang and Matthew Gombolay Stochastic Resource Optimization over Heterogeneous Graph Neural Networks for Failure-Predictive Maintenance Scheduling | Abstract Resource optimization for predictive maintenance is a challenging computational problem that requires inferring and reasoning over stochastic failure models and dynamically allocating repair resources. Predictive maintenance scheduling is typically performed with a combination of ad hoc, handcrafted heuristics with manual scheduling corrections by human domain experts, which is a labor-intensive process that is hard to scale. In this paper, we develop an innovative heterogeneous graph neural network to automatically learn an end-to-end resource scheduling policy. Our approach is fully graph-based with the addition of state summary and decision value nodes that provides a computationally lightweight and nonparametric means to perform dynamic scheduling. We augment our policy optimization procedure to enable robust learning in highly stochastic environments for which typical actor-critic reinforcement learning methods are ill-suited. In consultation with aerospace industry partners, we develop a virtual predictive-maintenance environment for a heterogeneous fleet of aircraft, called AirME. Our approach sets a new state-of-the-art by outperforming conventional, hand-crafted heuristics and baseline learning methods across problem sizes and various objective functions. In Proc. International Conference on Automated Planning and Scheduling (ICAPS). [31% Acceptance Rate] |
Mariah Schrum*, Erin Hedlund-Botti*, Nina Moorman, and Matthew Gombolay MIND MELD: Personalized Meta-Learning for Robot-Centric Imitation Learning | Abstract Learning from demonstration (LfD) techniques seek to enable users without computer programming experience to teach robots novel tasks. There are generally two types of LfD: human- and robot-centric. While human-centric learning is intuitive, it suffers from performance degradation due to covariate shift. Robot-centric approaches, such as Dataset Aggregation (DAgger), address covariate shift but can struggle to learn from suboptimal human teachers. To create a more human-aware version of robot-centric LfD, we present Mutual Information-driven Meta-learning from Demonstration (MIND MELD). MIND MELD meta-learns a mapping from suboptimal and heterogeneous human feedback to optimal labels, thereby improving the learning signal for robot-centric LfD. The key to our approach is learning an informative personalized embedding using mutual information maximization via variational inference. The embedding then informs a mapping from human-provided labels to optimal labels. We evaluate our framework in a human-subjects experiment, demonstrating that our approach improves corrective labels provided by human demonstrators. Our framework outperforms baselines in terms of ability to reach the goal (p < .001), average distance from the goal (p = .006), and various subjective ratings (p = .008). In Proc. International Conference on Human-Robot Interaction (HRI). [25% Acceptance Rate] [Best Paper Award] |
Sachin Konan*, Esmaeil Seraj*, and Matthew Gombolay Iterated Reasoning with Mutual Information in Cooperative and Byzantine Decentralized Teaming | Abstract Information sharing is key in building team cognition and enables coordination and cooperation. High-performing human teams also benefit from acting strategically with hierarchical levels of iterated communication and rationalizability, meaning a human agent can reason about the actions of their teammates in their decision-making. Yet, the majority of prior work in Multi-Agent Reinforcement Learning (MARL) does not support iterated rationalizability and only encourages inter-agent communication, resulting in a suboptimal equilibrium cooperation strategy. In this work, we show that reformulating an agent’s policy to be conditional on the policies of its neighboring teammates inherently maximizes the Mutual Information (MI) lower bound when optimizing under Policy Gradient (PG). Building on the idea of decision-making under bounded rationality and cognitive hierarchy theory, we show that our modified PG approach not only maximizes local agent rewards but also implicitly reasons about MI between agents without the need for any explicit ad-hoc regularization terms. Our approach, InfoPG, outperforms baselines in learning emergent collaborative behaviors and sets the state-of-the-art in decentralized cooperative MARL tasks. Our experiments validate the utility of InfoPG by achieving higher sample efficiency and significantly larger cumulative reward in several complex cooperative multi-agent domains. In Proc. Conference on Learning Representations (ICLR). [32% Acceptance Rate] |
Andrew Silva, Rohit Chopra, and Matthew Gombolay Cross-Loss Influence Functions to Explain Deep Network Representations | Abstract As machine learning is increasingly deployed in the real world, it is ever more vital that we understand the decision-criteria of the models we train. Recently, researchers have shown that influence functions, a statistical measure of sample impact, may be extended to approximate the effects of training samples on classification accuracy for deep neural networks. However, prior work only applies to supervised learning setups where training and testing share an objective function. Despite the rise in unsupervised learning, self-supervised learning, and model pre-training, there are currently no suitable technologies for estimating influence of deep networks that do not train and test on the same objective. To overcome this limitation, we provide the first theoretical and empirical demonstration that influence functions can be extended to handle mismatched training and testing settings. Our result enables us to compute the influence of unsupervised and self-supervised training examples with respect to a supervised test objective. We demonstrate this technique on a synthetic dataset as well as two Skipgram language model examples to examine cluster membership and sources of unwanted bias. In Proc. Conference on Artificial Intelligence and Statistics (AISTATS). [29% Acceptance Rate] |
Esmaeil Seraj*, Zheyuan Wang*, Rohan Paleja*, Daniel Martin, Matthew Sklar, and Matthew Gombolay Learning Efficient Diverse Communication for Cooperative Heterogeneous Teaming | Abstract High-performing teams learn intelligent and efficient communication and coordination strategies to maximize their joint utility. These teams implicitly understand the different roles of heterogeneous team members and adapt their communication protocols accordingly. Multi-Agent Reinforcement Learning (MARL) seeks to develop computational methods for synthesizing such coordination strategies, but formulating models for heterogeneous teams with different state, action, and observation spaces has remained an open problem. Without properly modeling agent heterogeneity, as in prior MARL work that leverages homogeneous graph networks, communication becomes less helpful and can even deteriorate the cooperativity and team performance. We propose Heterogeneous Policy Networks (HetNet) to learn efficient and diverse communication models for coordinating cooperative heterogeneous teams. Building on heterogeneous graph-attention networks, we show that HetNet not only facilitates learning heterogeneous collaborative policies per existing agent-class but also enables end-to-end training for learning highly efficient binarized messaging. In Proc. Autonomous Agents and Multiagent Systems (AAMAS). [26% Acceptance Rate] |
Workshop/Symposium Papers and Doctoral Consortia |
Andrew Hundt*, William Agnew*, Vicky Zang, Severin Kacianka, and Matthew Gombolay Robots Enact Malignant Stereotypes | Abstract Stereotypes, bias, and discrimination have been extensively documented in Machine Learning (ML) methods such as Computer Vision (CV) [18, 80], Natural Language Processing (NLP) [6], or both, in the case of large image and caption models such as OpenAI CLIP [14]. In this paper, we evaluate how ML bias manifests in robots that physically and autonomously act within the world. We audit one of several recently published CLIP-powered robotic manipulation methods, presenting it with objects that have pictures of human faces on the surface which vary across race and gender, alongside task descriptions that contain terms associated with common stereotypes. Our experiments definitively show robots acting out toxic stereotypes with respect to gender, race, and scientifically-discredited physiognomy, at scale. Furthermore, the audited methods are less likely to recognize Women and People of Color. Our interdisciplinary sociotechnical analysis synthesizes across fields and applications such as Science Technology and Society (STS), Critical Studies, History, Safety, Robotics, and AI. We find that robots powered by large datasets and Dissolution Models (sometimes called “foundation models”, e.g. CLIP) that contain humans risk physically amplifying malignant stereotypes in general; and that merely correcting disparities will be insufficient for the complexity and scale of the problem. Instead, we recommend that robot learning methods that physically manifest stereotypes or other harmful outcomes be paused, reworked, or even wound down when appropriate, until outcomes can be proven safe, effective, and just. Finally, we discuss comprehensive policy changes and the potential of new interdisciplinary research on topics like Identity Safety Assessment Frameworks and Design Justice to better understand and address these harms. In Proc. RSS 2022 Workshop on Learning from Diverse, Offline Data (L-DOD). |
Max Zuo*, Logan Schick*, Matthew Gombolay*, and Nakul Gopalan* Efficient Exploration via First-Person Behavior Cloning Assisted Rapidly-Exploring Random Trees | Abstract Modern-day computer games have extremely large state and action spaces. To detect bugs in these games’ models, human testers play the games repeatedly to explore the game and find errors in the games. Such game play is exhaustive and time-consuming. Moreover, since robotics simulators depend on similar methods of model specification and debugging, the problem of finding errors in the model is of interest for the robotics community to ensure robot behaviors and interactions are consistent in simulators. Previous methods have used reinforcement learning [8] and search-based methods [6] including Rapidly-exploring Random Trees (RRT) to explore a game’s state-action space to find bugs. However, such search- and exploration-based methods are not efficient at exploring the state-action space without a pre-defined heuristic. In this work we attempt to combine a human tester’s expertise in solving games and the exhaustiveness of RRT to search a game’s state space efficiently with high coverage. This paper introduces human-seeded RRT (HS-RRT) and behavior-cloning-assisted RRT (CA-RRT), evaluating the number of game states searched and the time taken to explore those game states. We compare our methods to an existing weighted RRT [18] baseline previously studied for game exploration testing. We find HS-RRT and CA-RRT both explore more game states in fewer tree expansions/iterations when compared to the existing baseline. In each test, CA-RRT reached more states on average in the same number of iterations as RRT. In our tested environments, CA-RRT was able to reach the same number of states as RRT in more than 5,000 fewer iterations on average, almost a 50% reduction. In Proc. HRI 2022 Workshop on Machine Learning in Human-Robot Collaboration (MLHRC). |
Mariah Schrum, Erin Hedlund-Botti, and Matthew Gombolay Towards Improving Life-Long Learning Via Personalized, Reciprocal Teaching | Abstract In a world with ubiquitous robots, robots will need to be personalizable and capable of learning novel tasks from humans throughout their deployment. However, research has shown that humans can be poor teachers, making it difficult for robots to effectively learn from humans. In prior work, we introduced Mutual Information Driven Meta-Learning from Demonstration (MIND MELD), which learns to map suboptimal human demonstrations to higher-quality demonstrations. While this work effectively accounts for suboptimality on novel tasks within a set distribution of calibration tasks, MIND MELD does not convey to the demonstrator the way in which the demonstrator is suboptimal. If the human could learn how to provide better demonstrations, then the human might be able to effectively teach a broader range of novel, out-of-distribution tasks where MIND MELD does not readily account for potential demonstration suboptimality. In this work, we introduce Reciprocal MIND MELD, a framework in which the robot learns the way in which a demonstrator is suboptimal and utilizes this information to provide feedback to the demonstrator to improve their demonstrations long-term. In a human-subjects experiment, we demonstrate that the robot can effectively improve how a human provides feedback (p < .001). Additionally, we show that humans trust the robot more (p = .014) and feel more team fluency when the robot provides helpful advice (p = .014). In Proc. HRI 2022 Workshop on Lifelong Learning and Personalization in Long-Term Human-Robot Interaction (LEAP-HRI). |
David Fernandez, Guillermo Grande, Sean Ye, Nakul Gopalan, and Matthew Gombolay Interactive Learning with Natural Language Feedback | Abstract We seek to enable non-roboticist end-users to teach robots by examining how end-users can provide natural language feedback to the robot. We hypothesized that enabling users to use language to train an agent would be more intuitive as users don’t have to translate their intent through another system. We build upon Deep TAMER to allow users to provide feedback through natural language to a learning agent. Our algorithm includes (1) a Transformer-based language model to map natural language feedback to scalar reward values and (2) a method to synthetically assign rewards to nearby state-action pairs that were unexplored by the agent. We report our results from a 2×4 mixed-subjects experiment design to evaluate the usability, workload, and trainability of our system compared to Deep TAMER on simulated tasks. While the experimenters were able to train an agent in both simulated environments to achieve competitive rewards, we could not show that natural language feedback significantly lowered workload, increased usability, or trained better agents than baseline Deep TAMER with human subjects. This work indicates a need for further research into the types of feedback end-users prefer to use to train agents. In Proc. HRI 2022 Workshop on Participatory Design and End-User Programming for Human-Robot Interaction (PD/EUP). |
Esmaeil Seraj and Matthew Gombolay Embodied Team Intelligence in Multi-Robot Systems | Abstract High-performing human teams leverage intelligent and efficient communication and coordination strategies to collaboratively maximize their joint utility. Inspired by teaming behaviors among humans, I seek to develop computational methods for synthesizing intelligent communication and coordination strategies for collaborative multi-robot systems. I leverage both classical model-based control and planning approaches as well as data-driven methods such as Multi-Agent Reinforcement Learning (MARL) to provide several contributions towards enabling emergent cooperative teaming behavior across both homogeneous and heterogeneous (including agents with different capabilities) robot teams. In future work, I aim to investigate efficient ways to incorporate humans’ teaming strategies for robot teams and directly learn team coordination policies from human experts. In Proc. Autonomous Agents and Multiagent Systems (AAMAS) Doctoral Consortium. |
Rohan Paleja and Matthew Gombolay Mutual Understanding in Human-Machine Teaming | Abstract Collaborative robots (i.e., “cobots”) and machine learning-based virtual agents are increasingly entering the human workspace with the aim of increasing productivity, enhancing safety, and improving the quality of our lives. These agents will dynamically interact with a wide variety of people in dynamic and novel contexts, increasing the prevalence of human-machine teams in healthcare, manufacturing, and search-and-rescue. In this research, we enhance the mutual understanding within a human-machine team by enabling cobots to understand heterogeneous teammates via person-specific embeddings, identifying contexts in which xAI methods can help improve team mental model alignment, and enabling cobots to effectively communicate information that supports high-performance human-machine teaming. In Proc. Association for the Advancement of Artificial Intelligence Conference (AAAI) Doctoral Consortium. |
Andrew Silva and Matthew Gombolay Empirically Evaluating Meta Learning of Robot Explainability with Humans | Abstract As physically-embodied robots and digital assistants are deployed in the real world, these agents must be able to communicate their decision-making criteria to build trust, improve human-robot teaming, and enable collaboration. While the field of explainable machine learning has made great strides in building a set of mechanisms to enable such communication, these advancements often assume that one approach is ideally suited to one problem (e.g., decision trees are best for explaining how to triage patients in an emergency room), failing to recognize that individual users may have different past experiences or preferences. In this work, we present the design of a user study to evaluate a novel approach to personalization of robot explainability through meta-learning with humans. Our study will be the first to evaluate meta learning with humans in the loop and with multiple approaches to robot explainability. Our results will help to pave the way for academic and industry deployments of explainable machine learning to diverse user populations. In Proc. HRI 2022 Workshop Your Study Design (WYSD) Workshop. |
Sravan Jayanthi*, Letian Chen*, and Matthew Gombolay Strategy Discovery and Mixture in Lifelong Learning from Heterogeneous Demonstration | Abstract Learning from Demonstration (LfD) approaches empower end-users to teach robots novel tasks via demonstrations of the desired behaviors, democratizing access to robotics. A key challenge in LfD research is that users tend to provide heterogeneous demonstrations for the same task due to various strategies and preferences. Therefore, it is essential to develop LfD algorithms that ensure flexibility (the robot adapts to personalized strategies), efficiency (the robot achieves sample-efficient adaptation requiring only a few demonstrations by the user), and scalability (the robot reuses a concise set of strategies to represent a large number of behaviors). In this paper, we propose a novel algorithm, Dynamic Multi-Strategy Reward Distillation (DMSRD), which distills common knowledge between heterogeneous demonstrations, leverages learned strategies to construct mixture policies, and continues to improve by learning from all available data. Our personalized, federated, and lifelong LfD architecture surpasses benchmarks in two continuous control problems with an average 62% improvement in policy returns, 50% improvement in log likelihood, and 36% decrease in the estimated KL divergence between learned behavior and demonstrations, alongside stronger task reward correlation and more precise strategy rewards. In Proc. 2022 AAAI Interactive Machine Learning Workshop. |
2021 |
---|
Journal Papers |
Zheyuan Wang, Chen Liu, and Matthew Gombolay Heterogeneous Graph Attention Networks for Scalable Multi-Robot Scheduling with Temporospatial Constraints | Abstract Robot teams are increasingly being deployed in environments, such as manufacturing facilities and warehouses, to save cost and improve productivity. To efficiently coordinate multi-robot teams, fast, high-quality scheduling algorithms are essential to satisfy the temporal and spatial constraints imposed by dynamic task specification and part and robot availability. Traditional solutions include exact methods, which are intractable for large-scale problems, or application-specific heuristics, which require expert domain knowledge to develop. In this paper, we propose a novel heterogeneous graph attention network model, called ScheduleNet, to learn scheduling policies that overcome the limitations of conventional approaches. By introducing robot- and proximity-specific nodes into the simple temporal network encoding temporal constraints, we obtain a heterogeneous graph structure that is nonparametric in the number of tasks, robots and task resources or locations. We show that our model is end-to-end trainable via imitation learning on small-scale problems, and generalizes to large, unseen problems. Empirically, our method outperforms the existing state-of-the-art methods in a variety of testing scenarios involving both homogeneous and heterogeneous robot teams. In. Autonomous Robots. |
Esmaeil Seraj, Letian Chen, and Matthew Gombolay A Hierarchical Coordination Framework for Joint Perception-Action Tasks in Composite Robot Teams | Abstract We propose a collaborative planning and control algorithm to enhance cooperation for composite teams of autonomous robots in dynamic environments. Composite robot teams are groups of agents that perform different tasks according to their respective capabilities in order to accomplish an overarching mission. Examples of such teams include groups of perception agents (can only sense) and action agents (can only manipulate) working together to perform disaster response tasks. Coordinating robots in a composite team is a challenging problem due to the heterogeneity in the robots’ characteristics and their tasks. Here, we propose a coordination framework for composite robot teams. The proposed framework consists of two hierarchical modules: (1) a Multi-Agent State-Action-Reward-Time-State-Action (MA-SARTSA) algorithm in a Multi-Agent Partially Observable Semi-Markov Decision Process (MA-POSMDP) as the high-level decision-making module to enable perception agents to learn to surveil in an environment with an unknown number of dynamic targets and (2) a low-level coordinated control and planning module that ensures probabilistically-guaranteed support for action agents. Simulation and physical robot implementations of our algorithms on a multi-agent robot testbed demonstrated the efficacy and feasibility of our coordination framework by reducing the overall operation times in a benchmark wildfire-fighting case study. In. IEEE Transactions on Robotics. |
Ruisen Liu, Manisha Natarajan, and Matthew Gombolay Coordinating Human-Robot Teams with Dynamic and Stochastic Task Proficiencies | Abstract As robots become ubiquitous in the workforce, it is essential that human-robot collaboration be both intuitive and adaptive. A robot’s ability to coordinate team activities improves based on its ability to infer and reason about the dynamic (i.e., the “learning curve”) and stochastic task performance of its human counterparts. We introduce a novel resource coordination algorithm that enables robots to schedule team activities by 1) actively characterizing the task performance of their human teammates and 2) ensuring the schedule is robust to temporal constraints given this characterization. We first validate our modeling assumptions via a user study. From this user study, we create a data-driven prior distribution over human task performance for our virtual and physical evaluations of human-robot teaming. Second, we show that our methods are scalable and produce high-quality schedules. Third, we conduct a between-subjects experiment (n=90) to assess the effects on a human-robot team of a robot scheduler actively exploring the humans’ task proficiency. Our results indicate that human-robot working alliance (p<0.001) and human performance (p=0.00359) are maximized when the robot dedicates more time to exploring the capabilities of human teammates. In. ACM Transactions on Human-Robot Interaction (THRI), Volume 11, Issue 1, pages 1-42. |
Conference Papers |
Rohan Paleja, Muyleng Ghuy, Nadun Ranawaka, Reed Jensen, and Matthew Gombolay The Utility of Explainable AI in Ad Hoc Human-Machine Teaming | Abstract Recent advances in machine learning have led to growing interest in Explainable AI (xAI) to enable humans to gain insight into the decision-making of machine learning models. Despite this recent interest, the utility of xAI techniques has not yet been characterized in human-machine teaming. Importantly, xAI offers the promise of enhancing team situational awareness (SA) and shared mental model development, which are the key characteristics of effective human-machine teams. Rapidly developing such mental models is especially critical in ad hoc human-machine teaming, where agents do not have a priori knowledge of others’ decision-making strategies. In this paper, we present two novel human-subject experiments quantifying the benefits of deploying xAI techniques within a human-machine teaming scenario. First, we show that xAI techniques can support SA ($p<0.05$). Second, we examine how different SA levels induced via a collaborative AI policy abstraction affect ad hoc human-machine teaming performance. Importantly, we find that the benefits of xAI are not universal, as there is a strong dependence on the composition of the human-machine team. Novices benefit from xAI providing increased SA ($p<0.05$) but are susceptible to cognitive overhead ($p<0.05$). On the other hand, expert performance degrades with the addition of xAI-based support ($p<0.05$), indicating that the cost of paying attention to the xAI outweighs the benefits obtained from being provided additional information to enhance SA. Our results demonstrate that researchers must deliberately design and deploy the right xAI techniques in the right scenario by carefully considering human-machine team composition and how the xAI method augments SA. In Proc. Conference on Neural Information Processing Systems (NeurIPS). [26% Acceptance Rate] |
Elias Stengel-Eskin*, Andrew Hundt*, Zhuohong He, Aditya Murali, Nakul Gopalan, Matthew Gombolay, and Gregory Hager Guiding Multi-Step Rearrangement Tasks with Natural Language Instructions | Abstract Enabling human operators to interact with robotic agents using natural language would allow non-experts to intuitively instruct these agents. Towards this goal, we propose a novel Transformer-based model which enables a user to guide a robot arm through a 3D multi-step manipulation task with natural language commands. Our system maps images and commands to masks over grasp or place locations, grounding the language directly in perceptual space. In a suite of block rearrangement tasks, we show that these masks can be combined with an existing manipulation framework without re-training, greatly improving learning efficiency. Our masking model is several orders of magnitude more sample efficient than typical Transformer models, operating with hundreds, not millions, of examples. Our modular design allows us to leverage supervised and reinforcement learning, providing an easy interface for experimentation with different architectures. Our model completes block manipulation tasks with synthetic commands 530% more often than a UNet-based baseline, and learns to localize actions correctly while creating a mapping of symbols to perceptual input that supports compositional reasoning. We provide a valuable resource for 3D manipulation instruction following research by porting an existing 3D block dataset with crowdsourced language to a simulated environment. Our method’s 25.3% absolute improvement in identifying the correct block on the ported dataset demonstrates its ability to handle syntactic and lexical variation. In Proc. Conference on Robot Learning (CoRL). [38% Acceptance Rate] |
Andrew Hundt*, Aditya Murali*, Priyanka Hubli, Ran Liu, Nakul Gopalan, Matthew Gombolay, and Gregory Hager “Good Robot! Now Watch This!”: Repurposing Reinforcement Learning for Task-to-Task Transfer | Abstract Modern Reinforcement Learning (RL) algorithms are not sample efficient to train on multi-step tasks in complex domains, impeding their wider deployment in the real world. We address this problem by leveraging the insight that RL models trained to complete one set of tasks can be re-purposed to complete related tasks when given just a handful of demonstrations. Based upon this insight, we propose See-SPOT-Run (SSR), a new computational approach to robot learning that enables a robot to complete a variety of real robot tasks in novel problem domains without task-specific training. SSR uses pretrained RL models to create vectors that represent model, task, and action relevance in demonstration and test scenes. SSR then compares these vectors via our Cycle Consistency Distance (CCD) metric to determine the next action to take. SSR completes 58% more task steps and 20% more trials than a baseline few-shot learning method that requires task-specific training. SSR also achieves a four-orders-of-magnitude improvement in compute efficiency and a 20% to three-orders-of-magnitude improvement in sample efficiency compared to the baseline and to training RL models from scratch. To our knowledge, we are the first to address multi-step tasks from demonstration on a real robot without task-specific training, where both the visual input and action space output are high dimensional. In Proc. Conference on Robot Learning (CoRL). [38% Acceptance Rate] |
Roger Dias, Marco Zenati, Geoff Rance, Rithy Srey, David Arney, Letian Chen, Rohan Paleja, Lauren Kennedy-Metz, and Matthew Gombolay Using Machine Learning to Predict Perfusionists’ Critical Decision-Making during Cardiac Surgery | Abstract The cardiac surgery operating room is a high-risk and complex environment in which multiple experts work as a team to provide safe and excellent care to patients. During the cardiopulmonary bypass phase of cardiac surgery, critical decisions need to be made and the perfusionists play a crucial role in assessing available information and taking a certain course of action. In this paper, we report the findings of a simulation-based study using machine learning to build predictive models of perfusionists’ decision-making during critical situations in the operating room (OR). Performing 30-fold cross-validation across 30 random seeds, our machine learning approach was able to achieve an accuracy of 78.2% (95% confidence interval: 77.8% to 78.6%) in predicting perfusionists’ actions, having access to only 148 simulations. The findings from this study may inform future development of computerised clinical decision support tools to be embedded into the OR, improving patient safety and surgical outcomes. In. Computer Methods in Biomechanics and Biomedical Engineering. |
Samuel E. Broida BS, Mariah L. Schrum BS, Eric Yoon MD, Aidan P. Sweeney MS, Neil N. Dhruv BS, Matthew C. Gombolay PhD, MS, Sangwook T. Yoon MD, PhD Improving Surgical Triage in Spine Clinic: Predicting Likelihood of Surgery Using Machine Learning | Abstract Determining a patient’s surgical candidacy is an important part of the clinical referral process. Incorrectly triaging a patient to a non-operative provider versus a surgeon can be frustrating and burdensome for the patient and physician. Many clinics rely on administrative staff to manually assign new patients to appropriate providers based on standard intake questions and decision trees. The aim of this study was to use machine learning to more accurately screen surgical candidates in spine clinic based on an intake questionnaire and MRI reports. Our deep learning model that uses patient intake forms and prior MRI reports is able to accurately predict whether or not a patient will receive a surgical recommendation. The negative predictive values of our models show promise as a mechanism to refer patients for nonsurgical management rather than evaluation by a surgeon. This model could help reduce the number of unnecessary visits for patients and increase the proportion of operative candidates who are seen by surgeons. In Proc. American Academy of Orthopedic Surgeons (AAOS). |
Erin Hedlund*, Michael Johnson*, and Matthew Gombolay The Effects of a Robot’s Performance on Human Teachers for Learning from Demonstration Tasks | Abstract Learning from Demonstration (LfD) algorithms seek to enable end-users to teach robots new skills through human demonstration of a task. Previous studies have analyzed how robot failure affects human trust, but not in the context of the human teaching the robot. In this paper, we investigate how human teachers react to robot failure in an LfD setting. We conduct a study in which participants teach a robot how to complete three tasks, using one of three instruction methods, while the robot is pre-programmed to either succeed or fail at the task. We find that when the robot fails, people trust the robot less (p<.001) and themselves less (p=.004) and they believe that others will trust them less (p<.001). Human teachers also have a lower impression of the robot and themselves (p<.001) and found the task more difficult when the robot fails (p<.001). Motion capture was found to be a less difficult instruction method than teleoperation (p=.016), while kinesthetic teaching gave the teachers the lowest impression of themselves compared to teleoperation (p=.017) and motion capture (p<.001). Importantly, a mediation analysis showed that people's trust in themselves is heavily mediated by what they think that others -- including the robot -- think of them (p<.001). These results provide valuable insights to improving the human-robot relationship for LfD. In Proc. International Conference on Human-Robot Interaction (HRI). [23% Acceptance Rate] |
Mariah Schrum*, Glen Neville*, Michael Johnson*, Nina Moorman, Rohan Paleja, Karen Feigh, and Matthew Gombolay Effects of Social Factors and Team Dynamics on Adoption of Collaborative Robot Autonomy | Abstract As automation becomes more prevalent, the fear of job loss due to automation increases. Workers may not be amenable to working with a robotic co-worker due to a negative perception of the technology. The attitudes of workers towards automation are influenced by a variety of complex and multi-faceted factors such as intention to use, perceived usefulness and other external variables. In an analog manufacturing environment, we explore how these various factors influence an individual’s willingness to work with a robot over a human co-worker in a collaborative Lego building task. We specifically explore how this willingness is affected by: 1) the level of social rapport established between the individual and his or her human co-worker, 2) the anthropomorphic qualities of the robot, and 3) factors including trust, fluency and personality traits. Our results show that a participant’s willingness to work with automation decreased due to lower perceived team fluency (p=0.045), rapport established between a participant and their co-worker (p=0.003), the gender of the participant being male (p=0.041), and a higher inherent trust in people (p=0.018). In Proc. International Conference on Human-Robot Interaction (HRI). [23% Acceptance Rate] |
Esmaeil Seraj*, Vahid Azimi*, Chaouki Abdallah, Seth Hutchinson, and Matthew Gombolay Adaptive Leader-Follower Control for Multi-Robot Teams with Uncertain Network Structure | Abstract Traditionally-designed, centralized or decentralized control architectures typically rely on the availability of communication channels between neighboring robots as well as a known, static network structure to tightly coordinate their actions in order to achieve global consensus. Unfortunately, communication constraints and network disconnectivity are key bottlenecks in such approaches, leading to the failure of conventional centralized or decentralized networked controllers in achieving stability and global consensus. To overcome these limitations, we develop a centralized, coordinated-control structure for multi-robot teams with uncertain network structure. Our novel approach enables multi-robot teams to achieve consensus even with disconnected communication graphs. Leveraging a model reference adaptive control framework and networked control architectures, we develop a coordinated leader-follower consensus controller capable of overcoming communication losses within the team, handling non-communicative robots, and compensating for environmental noise. We prove the stability of our controller and empirically validate our approach by analyzing the effects of reference graph structures and environmental noise on the performance of the robot team for navigation tasks. Finally, we demonstrate our novel controller in a multi-robot testbed. In Proc. American Control Conference (ACC). |
Yaru Niu*, Rohan Paleja*, and Matthew Gombolay Multi-Agent Graph-Attention Communication and Teaming | Abstract High-performing teams learn effective communication strategies to judiciously share information and reduce the cost of communication overhead. Within multi-agent reinforcement learning, synthesizing effective policies requires reasoning about when to communicate, whom to communicate with, and how to process messages. We propose a novel multi-agent reinforcement learning algorithm, Multi-Agent Graph-attentIon Communication (MAGIC), with a graph-attention communication protocol in which we learn 1) a Scheduler to help with the problems of when to communicate and whom to address messages to, and 2) a Message Processor using Graph Attention Networks (GATs) with dynamic graphs to deal with communication signals. The Scheduler consists of a graph attention encoder and a differentiable attention mechanism, which outputs dynamic, differentiable graphs to the Message Processor, which enables the Scheduler and Message Processor to be trained end-to-end. We evaluate our approach on a variety of cooperative tasks, including Google Research Football. Our method outperforms baselines across all domains, achieving $\approx 10\%$ increase in reward in the most challenging domain. We also show MAGIC communicates $23.2\%$ more efficiently than the average baseline, is robust to stochasticity, and scales to larger state-action spaces. Finally, we demonstrate MAGIC on a physical, multi-robot testbed. In Proc. Autonomous Agents and Multiagent Systems (AAMAS). [25% Acceptance Rate] |
Andrew Silva and Matthew Gombolay Encoding Human Domain Knowledge to Warm Start Reinforcement Learning | Abstract Deep reinforcement learning has seen great success across a breadth of tasks, such as in game playing and robotic manipulation. However, the modern practice of attempting to learn tabula rasa disregards the logical structure of many domains and the wealth of readily available knowledge from domain experts that could help “warm start” the learning process. Further, learning from demonstration techniques are not yet efficient enough to infer this knowledge through sampling-based mechanisms in large state and action spaces. We present a new reinforcement learning architecture that can encode expert knowledge, in the form of propositional logic, directly into a neural, tree-like structure of fuzzy propositions amenable to gradient descent and show that our novel architecture is able to outperform reinforcement and imitation learning techniques across an array of reinforcement learning challenges. We further conduct a user study to solicit expert policies from a variety of humans and find that humans are able to specify policies that provide a higher quality reward both before and after training relative to baseline methods, demonstrating the utility of our approach. In Proc. Conference on Artificial Intelligence (AAAI). [21% Acceptance Rate] |
Laura Strickland, Charles Pippin, and Matthew Gombolay Learning to Steer Swarm-vs.-Swarm Engagements | Abstract UAVs are becoming increasingly commonplace, and with their growing popularity, the question of how to counter a swarm of UAVs operated by bad actors becomes more critical. In this paper, we explore the possibility of using a team of fixed-wing UAVs to counter an adversarial swarm of fixed-wing UAVs. To learn to coordinate counter-swarm tactics, we propose Situation-Dependent Option-action Evaluation (SDOE), a distributed and scalable actor-critic RL architecture. Our approach enables each UAV to evaluate options over a set of scripted tactics as well as the option to maneuver freely, allowing for emergent team behavior. A key to the scalability of our approach is a novel, distributed neural network architecture that enables agents to share situational awareness and select tactics in a pairwise fashion, allowing agents to choose who to coordinate with, when, and how regardless of the size of the swarm. We test agents trained with our approach in simulated engagements of up to 16-vs.-16 UAVs, and find that, even as the size of the engagement increases, the agents trained using SDOE against a greedy, non-coordinating tactic win engagements against a team of greedy agents more reliably than another team of greedy agents. In Proc. American Institute of Aeronautics and Astronautics (AIAA) SciTech 2021 Forum. |
Michael Johnson, Ruisen Liu, Nakul Gopalan, and Matthew Gombolay An Approach to Human-Robot Collaborative Drilling and Fastening in Aerospace Final Assembly | Abstract The aerospace manufacturing industry is increasingly adopting automated machinery to accomplish labor-intensive tasks and to meet growing production demands. Traditionally, production floors are filled with fixed-installation robots that are not easily adapted to changing needs. In this work, we present an approach to using a collaborative robot to complete drilling and fastening tasks that can adapt to new environments by leveraging a human operator and expert demonstrator. The human trains the robot to complete the task autonomously by defining its environment and providing the robot demonstrations on how to locate, classify, and insert fasteners into a fuselage. The system begins with no information and uses offline and online learning techniques to develop a data bank of relevant information to improve the insertion process within the workspace. We show the results of unit tests that evaluate the multiple steps of the learning-execution process and draw conclusions from our observations. In Proc. American Institute of Aeronautics and Astronautics (AIAA) SciTech 2021 Forum. |
Workshop/Symposium Papers |
Yaru Niu*, Rohan Paleja*, and Matthew Gombolay MAGIC: Multi-Agent Graph-Attention Communication | Abstract High-performing teams learn effective communication strategies to judiciously share information and reduce the cost of communication overhead. Within multi-agent reinforcement learning, synthesizing effective policies requires reasoning about when to communicate, whom to communicate with, and how to process messages. We propose a novel multi-agent reinforcement learning algorithm, Multi-Agent Graph-attentIon Communication (MAGIC), with a graph-attention communication protocol in which we learn 1) a Scheduler to help with the problems of when to communicate and whom to address messages to, and 2) a Message Processor using Graph Attention Networks (GATs) with dynamic graphs to deal with communication signals. The Scheduler consists of a graph attention encoder and a differentiable attention mechanism, which outputs dynamic, differentiable graphs to the Message Processor, which enables the Scheduler and Message Processor to be trained end-to-end. We evaluate our approach on a variety of cooperative tasks, including Google Research Football. Our method outperforms baselines across all domains, achieving $\approx 10\%$ increase in reward in the most challenging domain. We also show MAGIC communicates $23.2\%$ more efficiently than the average baseline, is robust to stochasticity, and scales to larger state-action spaces. Finally, we demonstrate MAGIC on a physical, multi-robot testbed. In Proc. ICCV 2021 Workshop on Multi-Agent Interaction and Relational Reasoning. [Spotlight Talk] [Best Paper Award] |
Vanya Cohen, Geraud Nangue Tasse, Nakul Gopalan, Steven James, Matthew Gombolay, and Benjamin Rosman Learning to Follow Language Instructions with Compositional Policies | Abstract We propose a framework that learns to execute natural language instructions in an environment consisting of goal-reaching tasks that share components of their task descriptions. Our approach leverages the compositionality of both value functions and language, with the aim of reducing the sample complexity of learning novel tasks. First, we train a reinforcement learning agent to learn value functions that can be subsequently composed through a Boolean algebra to solve novel tasks. Second, we fine-tune a seq2seq model pretrained on web-scale corpora to map language to logical expressions that specify the required value function compositions. Evaluating our agent in the BabyAI domain, we observe a decrease of 86% in the number of training steps needed to learn a second task after mastering a single task. Results from ablation studies further indicate that it is the combination of compositional value functions and language representations that allows the agent to quickly generalize to new tasks. In Proc. AAAI Artificial Intelligence for Human-Robot Interaction (AI-HRI) Fall Symposium. |
Letian Chen, Rohan Paleja, and Matthew Gombolay Towards Sample-efficient Apprenticeship Learning from Suboptimal Demonstration | Abstract Learning from Demonstration (LfD) seeks to democratize robotics by enabling non-roboticist end-users to teach robots to perform novel tasks by providing demonstrations. However, as demonstrators are typically non-experts, modern LfD techniques are unable to produce policies much better than the suboptimal demonstration. A previously-proposed framework, SSRR, has shown success in learning from suboptimal demonstration but relies on noise-injected trajectories to infer an idealized reward function. A random approach such as noise-injection to generate trajectories has two key drawbacks: 1) Performance degradation could be random depending on whether the noise is applied to vital states and 2) Noise-injection generated trajectories may have limited suboptimality and therefore will not accurately represent the whole scope of suboptimality. We present Systematic Self-Supervised Reward Regression, S3RR, to investigate systematic alternatives for trajectory degradation. In Proc. AAAI Artificial Intelligence for Human-Robot Interaction (AI-HRI) Fall Symposium. |
Mariah Schrum, Erin Hedlund, and Matthew Gombolay Improving Robot-Centric Learning from Demonstration via Personalized Embeddings | Abstract Learning from demonstration (LfD) techniques seek to enable novice users to teach robots novel tasks in the real world. However, prior work has shown that robot-centric LfD approaches, such as Dataset Aggregation (DAgger), do not perform well with human teachers. DAgger requires a human demonstrator to provide corrective feedback to the learner either in real time, which can result in degraded performance due to suboptimal human labels, or in a post hoc manner, which is time-intensive and often not feasible. To address this problem, we present Mutual Information-driven Metalearning from Demonstration (MIND MELD), which metalearns a mapping from poor-quality human labels to predicted ground-truth labels, thereby improving upon the performance of prior LfD approaches for DAgger-based training. The key to our approach for improving upon suboptimal feedback is mutual information maximization via variational inference, which learns a meaningful, personalized embedding that informs the mapping from human-provided labels to predicted ground-truth labels. We demonstrate our framework in a synthetic domain and in a human-subjects experiment, illustrating that our approach improves upon the corrective labels provided by a human demonstrator by 63%. In Proc. AAAI Artificial Intelligence for Human-Robot Interaction (AI-HRI) Fall Symposium. |
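The core mapping idea in the abstract above, personalizing a correction of noisy human labels via a per-demonstrator embedding, can be sketched very compactly. The sketch below is an assumed, simplified stand-in: it uses a plain learned embedding and a mean-squared-error loss, omitting the variational, mutual-information machinery of the actual method, and the toy "ground-truth" labels are fabricated only to show the training step.

```python
# Minimal sketch (illustrative, not the authors' implementation) of learning a
# per-demonstrator embedding and a mapping from that person's noisy corrective
# labels to predicted ground-truth labels.
import torch
import torch.nn as nn

n_demonstrators, d_embed = 10, 4

class LabelCorrector(nn.Module):
    def __init__(self):
        super().__init__()
        self.person = nn.Embedding(n_demonstrators, d_embed)  # personalized embedding
        self.map = nn.Sequential(nn.Linear(1 + d_embed, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, noisy_label, person_id):
        z = self.person(person_id)
        return self.map(torch.cat([noisy_label, z], dim=-1))

model = LabelCorrector()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy batch: in the paper, ground truth comes from a calibration phase; here we
# fabricate it purely to exercise one optimization step.
person = torch.randint(0, n_demonstrators, (64,))
noisy = torch.randn(64, 1)
truth = 0.5 * noisy + 0.1            # placeholder "true" corrective labels

pred = model(noisy, person)
loss = ((pred - truth) ** 2).mean()  # full method adds a mutual-information objective
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```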
Andrew Silva*, Pradyumna Tambwekar*, and Matthew Gombolay Towards a Comprehensive Understanding and Accurate Evaluation of Societal Biases in Pre-Trained Transformers | Abstract The ease of access to pre-trained transformers has enabled developers to leverage large-scale language models to build exciting applications for their users. While such pre-trained models offer convenient starting points for researchers and developers, there is little consideration for the societal biases captured within these models, risking the perpetuation of racial, gender, and other harmful biases when these models are deployed at scale. In this paper, we investigate gender and racial bias across ubiquitous pre-trained language models, including GPT-2, XLNet, BERT, RoBERTa, ALBERT and DistilBERT. We evaluate bias within pre-trained transformers using three metrics: WEAT, sequence likelihood, and pronoun ranking. We conclude with an experiment demonstrating the ineffectiveness of word-embedding techniques, such as WEAT, signaling the need for more robust bias testing in transformers. In Proc. North American Chapter of the Association for Computational Linguistics. |
Rohan Paleja, Andrew Silva, Letian Chen, and Matthew Gombolay Interpretable and Personalized Apprenticeship Scheduling: Learning Interpretable Scheduling Policies from Heterogeneous User Demonstrations | Abstract Resource scheduling and coordination is an NP-hard optimization problem requiring an efficient allocation of agents to a set of tasks with upper- and lower-bound temporal and resource constraints. Due to the large-scale and dynamic nature of resource coordination in hospitals and factories, human domain experts manually plan and adjust schedules on the fly. To perform this job, domain experts leverage heterogeneous strategies and rules of thumb honed over years of apprenticeship. What is critically needed is the ability to extract this domain knowledge in a heterogeneous and interpretable apprenticeship learning framework to scale beyond the power of a single human expert, a necessity in safety-critical domains. We propose a personalized and interpretable apprenticeship scheduling algorithm that infers an interpretable representation of all human task demonstrators by extracting decision-making criteria specified by an inferred, personalized embedding without constraining the number of decision-making strategies. We achieve near-perfect LfD accuracy in synthetic domains and 88.22% accuracy on a real-world planning domain, outperforming baselines. Further, a user study shows that our methodology produces both interpretable and highly usable models (p < 0.05). In Proc. AAMAS Autonomous Robots and Multirobot Systems (ARMS) Workshop. |
2020 | Journal Papers |
Zheyuan Wang and Matthew Gombolay Learning Scheduling Policies for Multi-Robot Coordination with Graph Attention Networks | Abstract Increasing interest in integrating advanced robotics within manufacturing has spurred renewed interest in developing real-time scheduling solutions to coordinate human-robot collaboration in this environment. Traditionally, the problem of scheduling agents to complete tasks with temporal and spatial constraints has been approached either with exact algorithms, which are computationally intractable for large-scale, dynamic coordination, or with approximate methods that require domain experts to craft heuristics for each application. We seek to overcome the limitations of these conventional methods by developing a novel graph attention network-based scheduler to automatically learn features of scheduling problems towards generating high-quality solutions. To learn effective policies for combinatorial optimization problems, we combine imitation learning, which makes use of expert demonstration on small problems, with graph neural networks, in a non-parametric framework, to allow for fast, near-optimal scheduling of robot teams of various sizes, while generalizing to large, unseen problems. Experimental results showed that our network-based policy was able to find high-quality solutions for ~90% of the testing problems involving scheduling 2–5 robots and up to 100 tasks, which significantly outperforms prior state-of-the-art, approximate methods. Those results were achieved with affordable computation cost and up to 100x less computation time compared to exact solvers. In. IEEE Robotics and Automation Letters, Volume 5, Issue 3, pages 4509-4516. |
Mariah Schrum and Matthew C. Gombolay When Your Robot Breaks: Active Learning During Plant Failure | Abstract Detecting and adapting to catastrophic failures in robotic systems requires a robot to learn its new dynamics quickly and safely to best accomplish its goals. To address this challenging problem, we propose probabilistically-safe, online learning techniques to infer the altered dynamics of a robot at the moment a failure (e.g., physical damage) occurs. We combine model predictive control and active learning within a chance-constrained optimization framework to safely and efficiently learn the new plant model of the robot. We leverage a neural network for function approximation in learning the latent dynamics of the robot under failure conditions. Our framework generalizes to various damage conditions while being computationally lightweight, enabling real-time deployment. We empirically validate within a virtual environment that we can regain control of a severely damaged aircraft in seconds and require only 0.1 seconds to find safe, information-rich trajectories, outperforming state-of-the-art approaches. In. IEEE Robotics and Automation Letters. |
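The two ingredients this abstract combines, online estimation of post-failure dynamics and probabilistically safe, information-seeking action selection, can be illustrated with a toy one-dimensional sketch. Everything below is an assumed simplification: it uses batch least squares instead of a neural network, a 2-sigma bound in place of the paper's chance-constrained MPC, and a fabricated "damaged" plant purely for demonstration.

```python
# Illustrative sketch (not the authors' method in full) of safely exciting and
# identifying unknown post-failure dynamics x' = A x + B u + noise.
import numpy as np

rng = np.random.default_rng(2)
true_A, true_B = 0.9, 0.3        # unknown post-failure dynamics
x, X, y = 0.0, [], []

def fit(X, y):
    """Least-squares estimate of [A, B] plus residual variance."""
    Phi, t = np.asarray(X), np.asarray(y)
    theta, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    resid_var = np.var(t - Phi @ theta) if len(t) > 2 else 1.0
    return theta, resid_var

for step in range(30):
    if len(X) > 2:
        (A_hat, B_hat), var = fit(X, y)
    else:
        A_hat, B_hat, var = 0.0, 0.0, 1.0
    # Candidate controls; keep those whose predicted next state stays in the safe
    # set |x'| <= 1 with roughly 97.7% probability (a 2-sigma chance constraint).
    candidates = np.linspace(-1, 1, 21)
    pred = A_hat * x + B_hat * candidates
    safe = candidates[np.abs(pred) + 2 * np.sqrt(var) <= 1.0]
    u = safe[np.argmax(np.abs(safe))] if len(safe) else 0.0  # most informative safe input
    x_next = true_A * x + true_B * u + 0.01 * rng.normal()
    X.append([x, u]); y.append(x_next); x = x_next

print("estimated [A, B]:", fit(X, y)[0])
```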
Conference Papers |
Mariah Schrum, Eric Yoon, Matthew Gombolay, and Sangwook Yoon A Deep Learning Approach to Efficiently Triaging Spine Surgery Patients Based Upon Computerized Intake Questionnaires | Abstract This is the first study investigating the efficacy of using intake questionnaires for triaging patients for spine surgery via deep neural networks. Our results show that we are able to tune the algorithm to prioritize capturing surgical patients or reducing nonsurgical patients. This system can be adapted to different priorities in an automated manner and to suit the needs of individual providers or the spine center as a whole. Because the system is automated, it is also scalable without increasing the per-triage cost. In Proc. Lumbar Spine Research Society (LSRS). [Podium Talk] |
Letian Chen, Rohan Paleja, and Matthew Gombolay Learning from Suboptimal Demonstration via Self-Supervised Reward Regression | Abstract Learning from Demonstration (LfD) seeks to democratize robotics by enabling non-roboticist end-users to teach robots to perform a task by providing a human demonstration. However, modern LfD techniques, e.g., inverse reinforcement learning (IRL), assume users provide at least stochastically optimal demonstrations. This assumption fails to hold in most real-world scenarios. Recent attempts to learn from sub-optimal demonstration leverage pairwise rankings following the Luce-Shepard rule. However, we show these approaches make incorrect assumptions and thus suffer from brittle, degraded performance. We overcome these limitations in developing a novel approach that bootstraps off suboptimal demonstrations to synthesize optimality-parameterized data to train an idealized reward function. We empirically validate that we learn an idealized reward function with ~0.95 correlation with ground-truth reward versus ~0.75 for prior work. We can then train policies achieving ~200% improvement over the suboptimal demonstration and ~90% improvement over prior work. We present a physical demonstration of teaching a robot a topspin strike in table tennis that achieves 32% faster returns and 40% more topspin than the user demonstration. In Proc. Conference on Robot Learning (CoRL). [34% Acceptance Rate] [Plenary Talk] [Best Paper Finalist] |
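The self-supervised idea sketched in this abstract, deliberately degrading a demonstration policy and regressing a reward or score against the degradation level, can be shown end-to-end in a toy setting. The task, features, and labels below are placeholders chosen only so the sketch runs; the paper's actual synthesis and reward model are considerably richer.

```python
# Illustrative sketch of self-supervised reward regression: degrade a demonstrated
# policy with increasing noise and regress a score that orders trajectories by
# presumed optimality (more injected noise ~ lower performance).
import numpy as np

rng = np.random.default_rng(3)

def rollout(noise_level, horizon=50):
    """Toy 1-D task: the 'demonstrated' action is a_t = -x_t; noise corrupts it."""
    x, ret, states_actions = 1.0, 0.0, []
    for _ in range(horizon):
        a = -x + noise_level * rng.normal()
        states_actions.append((x, a))
        x = x + 0.5 * a
        ret += -x**2                     # ground-truth return (sanity check only)
    return np.array(states_actions), ret

noise_levels = np.linspace(0.0, 2.0, 20)
data = [rollout(n) for n in noise_levels]

# Regress a per-trajectory score from simple features onto -noise, a stand-in for
# the idealized performance label the method constructs.
feats = np.array([[np.mean(sa[:, 0]**2), np.mean(sa[:, 1]**2)] for sa, _ in data])
labels = -noise_levels
Phi = np.c_[feats, np.ones(len(feats))]
w, *_ = np.linalg.lstsq(Phi, labels, rcond=None)
scores = Phi @ w

# A well-ordered score should correlate with the true returns.
returns = np.array([r for _, r in data])
print(np.corrcoef(scores, returns)[0, 1])
```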
Rohan Paleja, Andrew Silva, Letian Chen, and Matthew Gombolay Interpretable and Personalized Apprenticeship Scheduling: Learning Interpretable Scheduling Policies from Heterogeneous User Demonstrations | Abstract Resource scheduling and coordination is an NP-hard optimization problem requiring an efficient allocation of agents to a set of tasks with upper- and lower-bound temporal and resource constraints. Due to the large-scale and dynamic nature of resource coordination in hospitals and factories, human domain experts manually plan and adjust schedules on the fly. To perform this job, domain experts leverage heterogeneous strategies and rules of thumb honed over years of apprenticeship. What is critically needed is the ability to extract this domain knowledge in a heterogeneous and interpretable apprenticeship learning framework to scale beyond the power of a single human expert, a necessity in safety-critical domains. We propose a personalized and interpretable apprenticeship scheduling algorithm that infers an interpretable representation of all human task demonstrators by extracting decision-making criteria specified by an inferred, personalized embedding without constraining the number of decision-making strategies. We achieve near-perfect LfD accuracy in synthetic domains and 88.22% accuracy on a real-world planning domain, outperforming baselines. Further, a user study shows that our methodology produces both interpretable and highly usable models (p < 0.05). In Proc. Conference on Neural Information Processing Systems (NeurIPS). [20% Acceptance Rate] |
Zheyuan Wang and Matthew Gombolay Heterogeneous Graph Attention Networks for Scalable Multi-Robot Scheduling with Temporospatial Constraints | Abstract Robot teams are increasingly being deployed in environments, such as manufacturing facilities and warehouses, to save cost and improve productivity. To efficiently coordinate multi-robot teams, fast, high-quality scheduling algorithms are essential to satisfy the temporal and spatial constraints imposed by dynamic task specification and part and robot availability. Traditional solutions include exact methods, which are intractable for large-scale problems, or application-specific heuristics, which require expert domain knowledge to develop. In this paper, we propose a novel heterogeneous graph attention network model, called ScheduleNet. By introducing robot- and proximity-specific nodes into the simple temporal network encoding temporal constraints, we obtain a heterogeneous graph structure that is nonparametric in the number of tasks, robots and task resources or locations. We show that our model is end-to-end trainable via imitation learning on small-scale problems, generalizing to large, unseen problems. Empirically, our method outperforms the existing state-of-the-art methods in a variety of testing scenarios. In Proc. Robotics: Science and Systems (RSS). [32% Acceptance Rate] |
Mariah L. Schrum*, Michael Johnson*, Muyleng Ghuy*, and Matthew C. Gombolay *denotes co-first authors Four Years in Review: Statistical Practices of Likert Scales in Human-Robot Interaction Studies | Abstract As robots become more prevalent, the importance of the field of human-robot interaction (HRI) grows accordingly. As such, we should endeavor to employ the best statistical practices. Likert scales are commonly used metrics in HRI to measure perceptions and attitudes. Due to misinformation or honest mistakes, most HRI researchers do not adopt best practices when analyzing Likert data. We conduct a review of psychometric literature to determine the current standard for Likert scale design and analysis. Next, we conduct a survey of four years of the International Conference on Human-Robot Interaction (2016 through 2019) and report on incorrect statistical practices and design of Likert scales. During these years, only 3 of the 110 papers applied proper statistical testing to correctly-designed Likert scales. Our analysis suggests there are areas for meaningful improvement in the design and testing of Likert scales. Lastly, we provide recommendations to improve the accuracy of conclusions drawn from Likert data. In Proc. Companion of the International Conference on Human-Robot Interaction (HRI). [alt.HRI Track] [19% Acceptance Rate] |
Letian Chen, Rohan Paleja, Muyleng Ghuy, and Matthew C. Gombolay Joint Goal and Strategy Inference across Heterogeneous Demonstrators via Reward Network Distillation | Abstract Reinforcement learning (RL) has achieved tremendous success as a general framework for learning how to make decisions. However, this success relies on the interactive hand-tuning of a reward function by RL experts. On the other hand, inverse reinforcement learning (IRL) seeks to learn a reward function from readily-obtained human demonstrations. Yet, IRL suffers from two major limitations: 1) reward ambiguity – there are an infinite number of possible reward functions that could explain an expert’s demonstration – and 2) heterogeneity – human experts adopt varying strategies and preferences, which makes learning from multiple demonstrators difficult due to the common assumption that demonstrators seek to maximize the same reward. In this work, we propose a method to jointly infer a task goal and humans’ strategic preferences via network distillation. This approach enables us to distill a robust task reward (addressing reward ambiguity) and to model each strategy’s objective (handling heterogeneity). We demonstrate our algorithm can better recover task and strategy rewards and imitate the strategies on two simulated tasks and a real-world table tennis task. In Proc. International Conference on Human-Robot Interaction (HRI). [24% Acceptance Rate] |
Manisha Natarajan and Matthew C. Gombolay Effects of Anthropomorphism and Accountability on Trust in Human-Robot Interaction | Abstract This paper examines how people’s trust and dependence on robot teammates providing decision support varies as a function of different attributes of the robot, such as perceived anthropomorphism, type of support provided by the robot, and its physical presence. We conduct a mixed-design user study with multiple robots to investigate trust, inappropriate reliance, and compliance measures in the context of a time-constrained game. We also examine how the effect of human accountability addresses errors due to over-compliance in the context of human-robot interaction (HRI). This study is novel as it involves examining multiple attributes at once, thus enabling us to perform multi-way comparisons between different attributes on trust and compliance with the agent. Results from the 4x4x2x2 study show that behavior and anthropomorphism of the agent are the most significant factors in predicting the trust and compliance with the robot. Furthermore, adding a coalition-building preface, where the agent provides context as to why it might make errors while giving advice, leads to an increase in trust for specific behaviors of the agent. In Proc. International Conference on Human-Robot Interaction (HRI). [24% Acceptance Rate] |
Sean Ye, Glen Neville, Mariah Schrum, Matthew Gombolay, Sonia Chernova, and Ayanna Howard Human Trust after Robot Mistakes: Study of the Effects of Different Forms of Robot Communication | Abstract Collaborative robots that work alongside humans will experience service breakdowns and make mistakes. These robotic failures can cause a degradation of trust between the robot and the community being served. A loss of trust may impact whether a user continues to rely on the robot for assistance. In order to improve the teaming capabilities between humans and robots, forms of communication that aid in developing and maintaining trust need to be investigated. In our study, we identify four forms of communication which dictate the timing of information given and type of initiation used by a robot. We investigate the effect that these forms of communication have on trust with and without robot mistakes during a cooperative task. Participants played a memory task game with the help of a humanoid robot that was designed to make mistakes after a certain amount of time passed. The results showed that participants’ trust in the robot was better preserved when that robot offered advice only upon request as opposed to when the robot took initiative to give advice. In Proc. International Conference on Robot and Human Interactive Communication (RO-MAN). |
Esmaeil Seraj and Matthew C. Gombolay Coordinated Control of UAVs for Human-Centered Active Sensing of Wildfires | Abstract Fighting wildfires is a precarious task, imperiling the lives of engaging firefighters and those who reside in the fire’s path. Firefighters need online and dynamic observation of the firefront to anticipate a wildfire’s unknown characteristics, such as size, scale, and propagation velocity, and to plan accordingly. In this paper, we propose a distributed control framework to coordinate a team of unmanned aerial vehicles (UAVs) for a human-centered active sensing of wildfires. We develop a dual-criterion objective function based on Kalman uncertainty residual propagation and weighted multi-agent consensus protocol, which enables the UAVs to actively infer the wildfire dynamics and parameters, track and monitor the fire transition, and safely manage human firefighters on the ground using acquired information. We evaluate our approach relative to prior work, showing significant improvements by reducing the environment’s cumulative uncertainty residual by more than $ 10^2 $ and $ 10^5 $ times in firefront coverage performance to support human-robot teaming for firefighting. We also demonstrate our method on physical robots in a mock firefighting exercise. In Proc. The American Control Conference (ACC). [Best Student Paper Finalist] |
Andrew Silva, Ivan Rodriguez-Jimenez, Taylor Killian, Sung-Hyun Son, and Matthew Gombolay Optimization Methods for Interpretable Differentiable Decision Trees in Reinforcement Learning | Abstract Decision trees are ubiquitous in machine learning for their ease of use and interpretability. Yet, these models are not typically employed in reinforcement learning as they cannot be updated online via stochastic gradient descent. We overcome this limitation by allowing for a gradient update over the entire tree that improves sample complexity and affords interpretable policy extraction. First, we include theoretical motivation on the need for policy-gradient learning by examining the properties of gradient descent over differentiable decision trees. Second, we demonstrate that our approach equals or outperforms a neural network on all domains and can learn discrete decision trees online with average rewards up to 7x higher than a batch-trained decision tree. Third, we conduct a user study to quantify the interpretability of a decision tree, rule list, and a neural network with statistically significant results (p < 0.001). In Proc. The International Conference on Artificial Intelligence and Statistics (AISTATS). |
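The building block that makes such trees trainable by gradient descent is the soft decision node: a sigmoid gate routes probability mass between leaves, so the whole structure is differentiable and can later be discretized into an ordinary, interpretable tree. The sketch below is a minimal one-node illustration under assumed dimensions; the paper's architecture and update rules are richer than this.

```python
# Minimal sketch of a differentiable (soft) decision node for an RL policy.
import torch
import torch.nn as nn

class SoftDecisionNode(nn.Module):
    def __init__(self, n_features, n_actions):
        super().__init__()
        self.w = nn.Parameter(torch.randn(n_features))      # learned split direction
        self.b = nn.Parameter(torch.zeros(1))                # learned split threshold
        self.left = nn.Parameter(torch.zeros(n_actions))     # leaf action logits
        self.right = nn.Parameter(torch.zeros(n_actions))

    def forward(self, x):
        p_left = torch.sigmoid(x @ self.w + self.b).unsqueeze(-1)  # soft routing
        logits = p_left * self.left + (1 - p_left) * self.right
        return torch.softmax(logits, dim=-1)                 # action distribution

node = SoftDecisionNode(n_features=4, n_actions=2)
probs = node(torch.randn(8, 4))        # batch of 8 states
print(probs.shape)                     # torch.Size([8, 2])

# For interpretability, the gate can be discretized (x @ w + b > 0) and each leaf
# replaced by its argmax action, yielding a crisp decision-tree policy.
```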
Workshop/Symposium Papers |
Ruisen Liu, Matthew Gombolay, and Stephen Balakirsky Towards Unpaired Human-to-Robot Demonstration Translation Learning Novel Tasks | Abstract Advancements in autonomy can enhance space flight and exploration by enabling robots as cost-efficient agents when humans are unavailable. However, long-term mission success may require continuous maintenance and the ability to adapt on the fly. When encountering a novel scenario that is outside expected robot capabilities, it becomes valuable for a non-robotics expert to be able to visually demonstrate the intended task execution to the robot. Relying on visual demonstration introduces ambiguity in mapping from human to robot execution. One mapping approach is to learn unpaired image translations from human demonstrations and unrelated robot motions. In this paper, we target extensions to image translation to enable robust conveyance of desired task execution. We propose methods to ground generated images with truth in kinematic feasibility, without imposing additional data collection or computational requirements on the demonstrator. In Proc. ICSR Workshop Human Robot Interaction for Space Robotics (HRI-SR). |
Yi Ting Sam, Manisha Natarajan, and Matthew Gombolay Stress and Performance in Human-Robot Space Teleoperation Tasks | Abstract This paper investigates the relationship between stress, workload, and performance in robot teleoperation tasks. The investigation is motivated by the need to develop human-aware robot autonomy for space exploration. Based on prior work, the relationship between stress and performance follows an inverted U, i.e., there exists an optimal level of stress at which performance is maximized. We present a pilot study that utilizes real-time stress sensors on participants undergoing six rounds of stress-inducing or stress-reducing conditions. The performance of the participants is recorded and analyzed together with their stress levels. We evaluate the relationship between stress (perceived and physiological), workload, and performance across three teleoperation tasks. We find that the variation in stress is not significant across different rounds, but we do observe significance for perceived workload (p < 0.001), stress (p < 0.05), and respiration rate (p < 0.01) for a teleoperation task that requires continuous maneuvering to navigate the robotic arm through a maze. We propose an improved experimental design to better characterize the stress-performance relationship. In Proc. ICSR Workshop Human Robot Interaction for Space Robotics (HRI-SR). |
Thesis |
Letian Chen and Matthew Gombolay Robot Learning from Heterogeneous Demonstration. | Master’s Thesis. Georgia Institute of Technology |
2019 | Journal Papers |
Matthew Gombolay, Toni Golen, Neel Shah, and Julie Shah Queueing theoretic analysis of labor and delivery | BibTeX @article{Gombolay:2019c, title={Queueing theoretic analysis of labor and delivery}, author={Gombolay, Matthew and Golen, Toni and Shah, Neel and Shah, Julie}, journal={Health Care Management Science}, pages={1–18}, year={2019}, publisher={Springer} } | Abstract Childbirth is a complex clinical service requiring the coordinated support of highly trained healthcare professionals as well as management of a finite set of critical resources (such as staff and beds) to provide safe care. The mode of delivery (vaginal delivery or cesarean section) has a significant effect on labor and delivery resource needs. Further, resource management decisions may impact the amount of time a physician or nurse is able to spend with any given patient. In this work, we employ queueing theory to model one year of transactional patient information at a tertiary care center in Boston, Massachusetts. First, we observe that the M/G/∞ model effectively predicts patient flow in an obstetrics department. This model captures the dynamics of labor and delivery where patients arrive randomly during the day, the duration of their stay is based on their individual acuity, and their labor progresses at some rate irrespective of whether they are given a bed. Second, using our queueing theoretic model, we show that reducing the rate of cesarean section – a current quality improvement goal in American obstetrics – may have important consequences with regard to the resource needs of a hospital. We also estimate the potential financial impact of these resource needs from the hospital perspective. Third, we report that application of our model to an analysis of potential patient coverage strategies supports the adoption of team-based care, in which attending physicians share responsibilities for patients. In: Health Care Management Science, 22(1), pp.16-33. |
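A short worked example helps make the M/G/∞ insight concrete: if patients arrive as a Poisson process and stay for an average duration E[S], the steady-state census is Poisson with mean equal to the arrival rate times E[S], regardless of the shape of the length-of-stay distribution. The numbers below are illustrative assumptions, not the Boston data analyzed in the study.

```python
# Worked M/G/infinity example: steady-state census ~ Poisson(lam * E[S]).
from math import exp, factorial

lam = 0.5          # arrivals per hour (assumed)
mean_stay = 18.0   # average length of stay in hours (assumed)
beds = 15          # bed capacity to compare against (assumed)

mean_census = lam * mean_stay            # = 9 patients on average

def poisson_pmf(k, mu):
    return exp(-mu) * mu**k / factorial(k)

# Probability that the census exceeds the number of physical beds.
p_over = 1.0 - sum(poisson_pmf(k, mean_census) for k in range(beds + 1))
print(f"mean census = {mean_census:.1f}, P(census > {beds} beds) = {p_over:.3f}")
```

Under this model, interventions that lengthen the average stay (for instance, a higher cesarean rate and its longer recovery) raise the mean census proportionally, which is the mechanism behind the resource-planning conclusions described in the abstract.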
Workshop/Symposium Papers |
Mariah Schrum and Matthew C. Gombolay Improving Clinical Care of Pediatric Cerebral Palsy Patients with Inverse Reinforcement Learning | Abstract Cerebral palsy (CP) patients exhibit pathological gait patterns as a result of a variety of neuromuscular defects. These gait patterns are typically used to inform therapeutic treatment, yet outcomes vary significantly among individuals within a gait class. We investigate inverse reinforcement learning as an approach to discover latent features of CP gait to help clinicians better understand an individual patient’s pathology and aid in clinical decision making. Furthermore, we develop deep reinforcement learning techniques that can prescribe ways in which a patient’s gait might be altered to help a patient better achieve their ideal gait. In Proc. ICRA Workshop Human Movement Science for Physical Human-Robot Collaboration. |
Rohan Paleja and Matthew C. Gombolay Heterogeneous Learning from Demonstration | Abstract The development of human-robot systems able to leverage the strengths of both humans and their robotic counterparts has been greatly sought after because of the foreseen, broad-ranging impact across industry and research. We believe the true potential of these systems cannot be reached unless the robot is able to act with a high level of autonomy, reducing the burden of manual tasking or teleoperation. To achieve this level of autonomy, robots must be able to work fluidly with their human partners, inferring their needs without explicit commands. This inference requires the robot to be able to detect and classify the heterogeneity of its partners. We propose a framework for learning from heterogeneous demonstration based upon Bayesian inference and evaluate a suite of approaches on a real-world dataset of gameplay from StarCraft II. This evaluation provides evidence that our Bayesian approach can outperform conventional methods by up to 12.8%. In Proc. International Conference on Human Robot Interaction (HRI) Pioneers Workshop. [32% Acceptance Rate] |
arXiv Papers |
Esmaeil Seraj, Andrew Silva, and Matthew C. Gombolay Safe Coordination of Human-Robot Firefighting Teams | Abstract Wildfires are destructive and inflict massive, irreversible harm to victims’ lives and natural resources. Researchers have proposed commissioning unmanned aerial vehicles (UAVs) to provide firefighters with real-time tracking information; yet, these UAVs are not able to reason about a fire’s track, including current location, measurement, and uncertainty, as well as propagation. We propose a model-predictive, probabilistically safe distributed control algorithm for human-robot collaboration in wildfire fighting. The proposed algorithm overcomes the limitations of prior work by explicitly estimating the latent fire propagation dynamics to enable intelligent, time-extended coordination of the UAVs in support of on-the-ground human firefighters. We derive a novel, analytical bound that enables UAVs to distribute their resources and provides a probabilistic guarantee of the humans’ safety while preserving the UAVs’ ability to cover an entire fire. In: arXiv preprint arXiv:1903.06847. |
2018 | Journal Papers |
Matthew C. Gombolay, Reed Jensen, Jessica Stigile, Toni Golen, Neel Shah, Sung-Hyun Son, and Julie A. Shah Human-Machine Collaborative Optimization via Apprenticeship Scheduling | BibTeX @article{DBLP:journals/corr/abs-1805-04220, author = {Matthew C. Gombolay and Reed Jensen and Jessica Stigile and Toni Golen and Neel Shah and Sung{-}Hyun Son and Julie A. Shah}, title = {Human-Machine Collaborative Optimization via Apprenticeship Scheduling}, journal = {CoRR}, volume = {abs/1805.04220}, year = {2018}, url = {http://arxiv.org/abs/1805.04220}, archivePrefix = {arXiv}, eprint = {1805.04220}, timestamp = {Mon, 13 Aug 2018 16:48:02 +0200}, biburl = {https://dblp.org/rec/bib/journals/corr/abs-1805-04220}, bibsource = {dblp computer science bibliography, https://dblp.org} } | Abstract Coordinating agents to complete a set of tasks with intercoupled temporal and resource constraints is computationally challenging, yet human domain experts can solve these difficult scheduling problems using paradigms learned through years of apprenticeship. A process for manually codifying this domain knowledge within a computational framework is necessary to scale beyond the “single-expert, single-trainee” apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes, causing the codification of this knowledge to become laborious. We propose a new approach for capturing domain-expert heuristics through a pairwise ranking formulation. Our approach is model-free and does not require enumerating or iterating through a large state space. We empirically demonstrate that this approach accurately learns multifaceted heuristics on a synthetic data set incorporating job-shop scheduling and vehicle routing problems, as well as on two real-world data sets consisting of demonstrations of experts solving a weapon-to-target assignment problem and a hospital resource allocation problem. We also demonstrate that policies learned from human scheduling demonstration via apprenticeship learning can substantially improve the efficiency of a branch-and-bound search for an optimal schedule. We employ this human-machine collaborative optimization technique on a variant of the weapon-to-target assignment problem. We demonstrate that this technique generates solutions substantially superior to those produced by human domain experts at a rate up to 9.5 times faster than an optimization approach and can be applied to optimally solve problems twice as complex as those solved by a human demonstrator. In: Journal of Artificial Intelligence Research, 63, 1-49. |
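The pairwise-ranking formulation described in the abstract can be illustrated with a small sketch: at each decision point, the action the expert actually scheduled should score higher than every action the expert passed over, so a classifier is trained on feature differences. The features, "expert" preference vector, and training loop below are toy assumptions for illustration, not the paper's datasets or model.

```python
# Illustrative sketch of apprenticeship scheduling via pairwise ranking.
import numpy as np

rng = np.random.default_rng(4)

# Hidden "expert" preference over task features (e.g., deadline slack, distance).
true_w = np.array([2.0, -1.0, 0.5])

def make_decision_point(n_candidates=6):
    feats = rng.normal(size=(n_candidates, 3))
    chosen = int(np.argmax(feats @ true_w))          # expert picks the best task
    return feats, chosen

# Pairwise dataset: (chosen - not_chosen) labelled +1, and the reverse -1.
X, y = [], []
for _ in range(300):
    feats, c = make_decision_point()
    for j in range(len(feats)):
        if j != c:
            X.append(feats[c] - feats[j]); y.append(1.0)
            X.append(feats[j] - feats[c]); y.append(-1.0)
X, y = np.array(X), np.array(y)

# Simple linear ranking model trained with a logistic loss (gradient descent).
w = np.zeros(3)
for _ in range(200):
    margins = y * (X @ w)
    grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y)
    w -= 0.5 * grad

print("recovered direction:", w / np.linalg.norm(w))
print("true direction:     ", true_w / np.linalg.norm(true_w))
```

Because the model scores candidate actions rather than enumerating schedules, it sidesteps the state-space explosion mentioned in the abstract, and the learned scorer can then prioritize branches in a branch-and-bound search.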
Rose Molina, Matthew C. Gombolay, Jennifer Jonas, Anna M. Modest, Julie A. Shah, Toni H. Golen, and Neel T. Shah Association Between Labor and Delivery Unit Census and Delays in Patient Management: Findings From a Computer Simulation Module | Abstract OBJECTIVE: To demonstrate the association between increases in labor and delivery unit census and delays in patient care decisions using a computer simulation module. METHODS: This was an observational cohort study of labor and delivery unit nurse managers. We developed a computer module that simulates the physical layout and clinical activity of the labor and delivery unit at our tertiary care academic medical center, in which players act as clinical managers in dynamically allocating nursing staff and beds as patients arrive, progress in labor, and undergo procedures. We exposed nurse managers to variation in patient census and measured the delays in resource decisions over the course of a simulated shift. We used mixed logistic and linear regression models to analyze the associations between patient census and delays in patient care. RESULTS: Thirteen nurse managers participated in the study and completed 17 12-hour shifts, or 204 simulated hours of decision-making. All participants reported the simulation module reflected their real-life experiences at least somewhat well. We observed 1.47-increased odds (95% CI 1.18-1.82) of recommending a patient ambulate in early labor for every additional patient on the labor and delivery unit. For every additional patient on the labor and delivery unit, there was a 15.9-minute delay between delivery and transfer to the postpartum unit (95% CI 2.4-29.3). For every additional patient in the waiting room, we observed a 33.3-minute delay in the time patients spent in the waiting room (95% CI 23.2-43.5) and a 14.3-minute delay in moving a patient in need of a cesarean delivery to the operating room (95% CI 2.8-25.8). CONCLUSION: Increasing labor and delivery unit census is associated with patient care delays in a computer simulation. Computer simulation is a feasible and valid method of demonstrating the sensitivity of care decisions to shifts in patient volume. In: Obstetrics and Gynecology, volume 131, number 3, pages 545-552. |
Matthew C. Gombolay, Ron J. Wilcox, and Julie A. Shah Fast Scheduling of Robot Teams Performing Tasks with Temporospatial Constraints | Abstract The application of robotics to traditionally manual manufacturing processes requires careful coordination between human and robotic agents in order to support safe and efficient coordinated work. Tasks must be allocated to agents and sequenced according to temporal and spatial constraints. Also, systems must be capable of responding on-the-fly to disturbances and people working in close physical proximity to robots. In this paper, we present a centralized algorithm, named “Tercio,” that handles tightly intercoupled temporal and spatial constraints. Our key innovation is a fast, satisficing multi-agent task sequencer inspired by real-time processor scheduling techniques and adapted to leverage hierarchical problem structure. We use this sequencer in conjunction with a MILP solver and empirically demonstrate the ability to generate near-optimal schedules for real-world problems an order of magnitude larger than those reported in prior art. Finally, we demonstrate the use of our algorithm in a multi-robot hardware testbed. In: IEEE Transactions on Robotics (IEEE T-RO), volume 34, number 1, pages 220-239. |
Conference Papers |
Joseph Kim, Matthew E. Woicik, Matthew C. Gombolay, Sung-Hyun Son, and Julie A. Shah Learning to Infer Final Plans in Human Team Planning | Abstract We envision an intelligent agent that analyzes conversations during human team meetings in order to infer the team’s plan, with the purpose of providing decision support to strengthen that plan. We present a novel learning technique to infer teams’ final plans directly from a processed form of their planning conversation. Our method employs reinforcement learning to train a model that maps features of the discussed plan and patterns of dialogue exchange among participants to a final, agreed-upon plan. We employ planning domain models to efficiently search the large space of possible plans, and the costs of candidate plans serve as the reinforcement signal. We demonstrate that our technique successfully infers plans within a variety of challenging domains, with higher accuracy than prior art. With our domain-independent feature set, we empirically demonstrate that our model trained on one planning domain can be applied to successfully infer team plans within a novel planning domain. In Proc. International Joint Conference on Artificial Intelligence (IJCAI). [20% Acceptance Rate] |
2017 | Journal Papers |
Matthew C. Gombolay, Reed Jensen, and Sung-Hyun Son Machine Learning Techniques for Analyzing Training Behavior in Serious Gaming | BibTeX @ARTICLE{Gombolay:2017d, author={Matthew Gombolay and Reed Jensen and Sung-Hyun Son}, title={Machine Learning Techniques for Analyzing Training Behavior in Serious Gaming}, journal={IEEE Transactions on Computational Intelligence and AI in Games (T-CIAIG)}, month = {Accepted September 2017, To Appear}, year={2017} } | Abstract Training time is a costly, scarce resource across domains such as commercial aviation, healthcare, and military operations. In the context of military applications, serious gaming – the training of warfighters through immersive, real-time environments rather than traditional classroom lectures – offers benefits to improve training not only in its hands-on development and application of knowledge, but also in data analytics via machine learning. In this paper, we explore an array of machine learning techniques that allow teachers to visualize the degree to which training objectives are reflected in actual play. First, we investigate the concept of discovery: learning how warfighters utilize their training tools and develop military strategies within their training environment. Second, we develop machine learning techniques that could assist teachers by automatically predicting player performance, identifying player disengagement, and recommending personalized lesson plans. These methods could potentially provide teachers with insight to assist them in developing better lesson plans and tailored instruction for each individual student. In: IEEE Transactions on Computational Intelligence and AI in Games (T-CIAIG), number 99, pages 1-12. |
Matthew C. Gombolay, Anna Bair, Cindy Huang, and Julie A. Shah Computational Design of Mixed-Initiative Human-Robot Teaming that Considers Human Factors: Situational Awareness, Workload, and Workflow Preferences | BibTeX @ARTICLE{Gombolay:2017e, author={Matthew Gombolay and Anna Bair and Cindy Huang and Julie Shah}, title={Computational Design of Mixed-Initiative Human-Robot Teaming that Considers Human Factors: Situational Awareness, Workload, and Workflow Preferences}, journal={International Journal of Robotics Research (IJRR)}, month = {Accepted December 2016, To Appear}, year={2017} } | Abstract Advancements in robotic technology are making it increasingly possible to integrate robots into the human workspace in order to improve productivity and decrease worker strain resulting from the performance of repetitive, arduous physical tasks. While new computational methods have significantly enhanced the ability of people and robots to work flexibly together, there has been little study into the ways in which human factors influence the design of these computational techniques. In particular, collaboration with robots presents unique challenges related to preserving human situational awareness and optimizing workload allocation for human teammates while respecting their workflow preferences. We conducted a series of three human subject experiments to investigate these human factors, and provide design guidelines for the development of intelligent collaborative robots based on our results. In: International Journal of Robotics Research (IJRR), volume 36, issue 5-7, pages 598-617. |
Conference Papers |
Matthew C. Gombolay, Xi Jessie Yang, Brad Hayes, Nicole Seo, Zixi Liu, Samir Wadhwania, Tania Yu, Neel Shah, Toni Golen, and Julie A. Shah Robotic Assistance in Coordination of Patient Care | BibTeX @inproceedings{Gombolay:2017c, title={Robotic Assistance in Coordination of Patient Care}, author={Gombolay, Matthew and Yang, Xi Jessie and Hayes, Brad and Seo, Nicole and Liu, Zixi and Wadhwania, Samir and Yu, Tania and Shah, Neel and Golen, Toni and Shah, Julie}, booktitle={Proceedings of Robotics: Science and Systems (RSS)}, year={2017} } | Abstract We conducted a study to investigate trust in and dependence upon robotic decision support among nurses and doctors on a labor and delivery floor. There is evidence that suggestions provided by embodied agents engender inappropriate degrees of trust and reliance among humans. This concern is a critical barrier that must be addressed before fielding intelligent hospital service robots that take initiative to coordinate patient care. Our experiment was conducted with nurses and physicians, and evaluated the subjects’ levels of trust in and dependence on high- and low-quality recommendations issued by robotic versus computer-based decision support. The support, generated through action-driven learning from expert demonstration, was shown to produce high-quality recommendations that were accepted by nurses and physicians at a compliance rate of 90%. Rates of Type I and Type II errors were comparable between robotic and computer-based decision support. Furthermore, embodiment appeared to benefit performance, as indicated by a higher degree of appropriate dependence after the quality of recommendations changed over the course of the experiment. These results support the notion that a robotic assistant may be able to safely and effectively assist in patient care. Finally, we conducted a pilot demonstration in which a robot assisted resource nurses on a labor and delivery floor at a tertiary care center. In Proc. Robotics: Science and Systems (RSS). [24% Acceptance Rate] |
Matthew C. Gombolay, Jessica Stigile, Reed Jensen, Sung-Hyun Son, and Julie A. Shah Apprenticeship Scheduling: Learning to Schedule from Human Experts | BibTeX @inproceedings{Gombolay:2016a, author = {Matthew Gombolay and Reed Jensen and Jessica Stigile and Sung-Hyun Son and Julie Shah}, title = {Apprenticeship Scheduling: Learning to Schedule from Human Experts}, booktitle = {Proceedings of the International Joint Conference on Artificial Intelligence ({IJCAI})}, address = {New York City, NY, U.S.A.}, month = {July 9-15}, year = {2016} } | Abstract Training time is a costly, scarce resource across domains such as commercial aviation, healthcare, and military operations. In the context of military applications, serious gaming – the training of warfighters through immersive, real-time environments rather than traditional classroom lectures – offers benefits to improve training not only in its hands-on development and application of knowledge, but also in data analytics via machine learning. In this paper, we explore an array of machine learning techniques that allow teachers to visualize the degree to which training objectives are reflected in actual play. First, we investigate the concept of discovery: learning how warfighters utilize their training tools and develop military strategies within their training environment. Second, we develop machine learning techniques that could assist teachers by automatically predicting player performance, identifying player disengagement, and recommending personalized lesson plans. These methods could potentially provide teachers with insight to assist them in developing better lesson plans and tailored instruction for each individual student. In Proc. International Joint Conference on Artificial Intelligence (IJCAI). [25% Acceptance Rate] |
Workshop/Symposium Papers |
Matthew C. Gombolay, Reed Jensen, Jessica Stigile, Sung-Hyun Son, and Julie Shah Learning to Tutor from Expert Demonstrators via Apprenticeship Scheduling | BibTeX @inproceedings{Gombolay:2017b, author = {Matthew Gombolay and Reed Jensen and Jessica Stigile and Sung-Hyun Son and Julie Shah}, title = {Learning to Tutor from Expert Demonstration via Apprenticeship Scheduling}, booktitle = {Proceedings of the Association for the Advancement of Artificial Intelligence (AAAI) Workshop on Human-Machine Collaborative Learning (HMCL)}, address = {San Francisco, California}, month = {February 4}, year = {2017}, } | Abstract We have conducted a study investigating the use of automated tutors for educating players in the context of serious gaming (i.e., a game designed as a professional training tool). Historically, researchers and practitioners have developed automated tutors through a process of manually codifying domain knowledge and translating that into a human-interpretable format. This process is laborious and leaves much to be desired. Instead, we seek to apply novel machine learning techniques to, first, learn a model from domain experts’ demonstrations of how to solve such problems, and, second, use this model to teach novices how to think like experts. In this work, we present a study comparing the performance of an automated and a traditional, manually constructed tutor. To our knowledge, this is the first investigation using learning from demonstration techniques to learn from experts and use that knowledge to teach novices. In Proc. AAAI Workshop on Human-Machine Collaborative Learning. |
Rose Molina, Matthew C. Gombolay, Jennifer Jonas, Julie Shah, Toni Golen, and Neel T. Shah Learning to Infer Final Plans in Human Team Planning | Abstract INTRODUCTION: Clinical managers in charge of labor and delivery units are challenged by the need to allocate scarce resources, such as beds and nursing staff, when there are surges in patient volume. When particularly busy, managers may resort to a variety of strategies, such as calling in additional staff or delaying new admissions. The thresholds for applying strategies that preserve labor floor resources have not been previously well defined and may have significant implications for patient safety. METHODS: We developed a virtual labor floor environment in JAVA to simulate dynamic labor floor conditions, including the expected waxing and waning of patient acuity and volume over the course of a nursing shift. We recorded an inter-professional cohort of clinicians making resource allocation decisions over multiple simulated shifts. Using logistic regression, we determined the odds of delaying admission for patients in early labor when labor floor occupancy varied. RESULTS: Eight nurses played the game for 20 minutes on average, which translates to 64 simulated hours of decision-making. In a logistic regression model, we found there is an increased odds of delaying admissions for early labor as the percent of labor and delivery bed occupancy increases (unadjusted odds ratio 1.11, 95% confidence interval 1.03, 1.19). CONCLUSION: Early labor admissions were more often delayed with increasing bed occupancy, indicating that the care of patients on labor and delivery units may be sensitive to the total occupancy of the unit. Additional research is needed to understand the impact of resource management on patient safety. In: Poster Presentation, ACOG Annual Clinical and Scientific Meeting. |
Thesis |
Matthew C. Gombolay Human-Machine Collaborative Optimization via Apprenticeship Scheduling | BibTeX @PHDTHESIS{Gombolay:2017a, author={Matthew Gombolay}, title={Human-Machine Collaborative Optimization via Apprenticeship Scheduling}, school={Department of Aeronautics and Astronautics, Massachusetts Institute of Technology}, month = {January}, year={2017} } | Abstract I envision a future where intelligent service robots become integral members of human-robot teams in the workplace. Today, service robots are being deployed across a wide range of settings; however, while these robots exhibit basic navigational abilities, they lack the ability to anticipate and adapt to the needs of their human teammates. I believe robots must be capable of autonomously learning from humans how to integrate into a team à la a human apprentice. Human domain experts and professionals become experts over years of apprenticeship, and this knowledge is not easily codified in the form of a policy. In my thesis, I develop a novel computational technique, Collaborative Optimization Via Apprenticeship Scheduling (COVAS), that enables robots to learn a policy to capture an expert’s knowledge by observing the expert solve scheduling problems. COVAS can then leverage the policy to guide branch-and-bound search to provide globally optimal solutions faster than state-of-the-art optimization techniques. Developing an apprenticeship learning technique for scheduling is challenging because of the complexities of modeling and solving scheduling problems. Previously, researchers have sought to develop techniques to learn from human demonstration; however, these approaches have rarely been applied to scheduling because of the large number of states required to encode the possible permutations of the problem and relevant problem features (e.g., a job’s deadlines, required resources, etc.). My thesis gives robots a novel ability to serve as teammates that can learn from and contribute to coordinating a human-robot team. The key to COVAS’ ability to efficiently and optimally solve scheduling problems is the use of a novel policy-learning approach – apprenticeship scheduling – suited for imitating the method an expert uses to generate the schedule. This policy learning technique uses pairwise comparisons between the action taken by a human expert (e.g., schedule agent a to complete task $\tau_i$ at time t) and each action not taken (e.g., unscheduled tasks at time t), at each moment in time, to learn the relevant model parameters and scheduling policies demonstrated in training examples provided by the human experts. I evaluate my technique in two real-world domains. First, I apply apprenticeship scheduling to the problem of anti-ship missile defense: protecting a naval vessel from an enemy attack by deploying decoys and countermeasures at the right place and time. I show that apprenticeship scheduling can learn to defend the ship, outperforming human experts on the majority of naval engagements (p < 0.011). Further, COVAS is able to produce globally optimal solutions an order of magnitude faster than traditional, state-of-the-art optimization techniques. Second, I apply apprenticeship scheduling to learn how to function as a resource nurse: the nurse in charge of ensuring the right patient is in the right type of room at the right time and that the right types of nurses are there to care for the patient.
After training an apprentice scheduler on demonstrations given by resource nurses, I found that nurses and physicians agreed with the algorithm's advice 90% of the time. Next, I conducted a series of human-subject experiments to understand the human factors consequences of embedding scheduling algorithms in robotic platforms. Through these experiments, I found that an embodied platform (i.e., a physical robot) engenders more appropriate trust and reliance in the system than an un-embodied one (i.e., computer-based system) when the scheduling algorithm works with human domain experts. However, I also found that increasing robot autonomy degrades human situational awareness. Further, there is a complex interplay between workload and workflow preferences that must be balanced to maximize team fluency. Based on these findings, I develop design guidelines for integrating service robots with autonomous decision-making capabilities into the human workplace. Ph.D. Thesis: Doctor of Philosophy in Autonomous Systems. |
2016 | Conference Papers |
Giancarlo Sturla, Matthew C. Gombolay, and Julie A. Shah Incremental Scheduling with Upper and Lowerbound Temporospatial Constraints | BibTeX @inproceedings{Sturla:2016, author = {Giancarlo Sturla and Matthew Gombolay and Julie Shah}, title = {Incremental Scheduling with Upper and Lowerbound Temporospatial Constraints}, booktitle = {Proceedings of AIAA SciTech}, month = {January}, year = {2016}, } In Proc. AIAA SciTech |
Workshop/Symposium Papers and Doctoral Consortia |
Matthew C. Gombolay and Julie A. Shah Apprenticeship Scheduling for Human-Robot Teams. In Proc. Association for the Advancement of Artificial Intelligence Conference on Artificial Intelligence (AAAI-16). Doctoral Consortium, 2016. [39% Acceptance Rate] |
Matthew C. Gombolay and Ankit Shah Appraisal of Statistical Practices in HRI vis-a-vis the T-Test for Likert Items/Scales | BibTeX @inproceedings{Gombolay:2016c, author = {Matthew Gombolay and Ankit Shah}, title = {Appraisal of Statistical Practices in HRI vis-\'{a}-vis the T-Test for Likert Items/Scales}, booktitle = {Proceedings of AAAI Fall symposium Series on Artificial Intelligence for Human-Robot Interaction (AI-HRI)}, address = {Arlington, Virginia}, month = {November 17–19}, year = {2016}, } | Abstract Likert items and scales are often used in human subject studies to measure subjective responses of subjects to the treatment levels. In the field of human-robot interaction (HRI), with few widely accepted quantitative metrics, researchers often rely on Likert items and scales to evaluate their systems. However, there is a debate on what is the best statistical method to evaluate the differences between experimental treatments based on Likert item or scale responses. Likert responses are ordinal and not interval, meaning the differences between consecutive responses to a Likert item are not equally spaced quantitatively. Hence, parametric tests like the t-test, which require interval and normally distributed data, are often claimed to be statistically unsound for evaluating Likert response data. The statistical purist would use non-parametric tests, such as the Mann-Whitney U test, to evaluate the differences in ordinal datasets; however, non-parametric tests sacrifice sensitivity in detecting differences for a more conservative specificity – or false-positive rate. Finally, it is common practice in the field of HRI to sum up similar individual Likert items to form a Likert scale and use the t-test or ANOVA on the scale, seeking refuge in the central limit theorem. In this paper, we empirically evaluate the validity of the t-test vs. the Mann-Whitney U test for Likert items and scales. We conduct our investigation via Monte Carlo simulation to quantify the sensitivity and specificity of the tests. In Proc. AAAI Fall Symposium Series on AI-HRI. |
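A small Monte Carlo sketch in the spirit of the comparison described above: simulate 5-point Likert-style ordinal responses with no true difference between two groups and estimate each test's false-positive (Type I error) rate. The sample sizes and the response distribution are arbitrary choices for illustration and are not the settings used in the paper.

```python
# Monte Carlo comparison of the t-test and Mann-Whitney U test on ordinal data.
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu

rng = np.random.default_rng(5)
levels = np.arange(1, 6)                     # 1..5 Likert item
probs = np.array([0.1, 0.2, 0.4, 0.2, 0.1])  # same distribution for both groups
n_per_group, n_sims, alpha = 20, 5000, 0.05

t_rejects = u_rejects = 0
for _ in range(n_sims):
    a = rng.choice(levels, size=n_per_group, p=probs)
    b = rng.choice(levels, size=n_per_group, p=probs)
    if ttest_ind(a, b).pvalue < alpha:
        t_rejects += 1
    if mannwhitneyu(a, b, alternative="two-sided").pvalue < alpha:
        u_rejects += 1

print(f"t-test Type I error:       {t_rejects / n_sims:.3f}")
print(f"Mann-Whitney Type I error: {u_rejects / n_sims:.3f}")
```

The same harness can be rerun with a genuine shift between the two groups' response distributions to compare the tests' sensitivity, which is the specificity/sensitivity trade-off the abstract discusses.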
2015 | Journal Papers |
Matthew C. Gombolay, Reymundo A. Gutierrez, Shanelle G. Clarke, Giancarlo F. Sturla, and Julie A. Shah Decision-Making Authority, Team Efficiency and Human Worker Satisfaction in Mixed Human-Robot Teams | BibTeX @ARTICLE{Gombolay:2015a, author={Gombolay, Matthew and Gutierrez, Reymundo and Clarke, Shanelle and Sturla, Giancarlo and Shah, Julie}, title={Decision-making authority, team efficiency and human worker satisfaction in mixed human-robot teams}, journal={Autonomous Robots}, issn={0929-5593}, volume={39}, number={3}, doi={10.1007/s10514-015-9457-9}, url={http://dx.doi.org/10.1007/s10514-015-9457-9}, publisher={Springer US}, pages={293-312}, year={2015} } | Abstract In manufacturing, advanced robotic technology has opened up the possibility of integrating highly autonomous mobile robots into human teams. However, with this capability comes the issue of how to maximize both team efficiency and the desire of human team members to work with these robotic counterparts. To address this concern, we conducted a set of experiments studying the effects of shared decision-making authority in human-robot and human-only teams. We found that an autonomous robot can outperform a human worker in the execution of part or all of the process of task allocation (p < 0.001 for both), and that people preferred to cede their control authority to the robot (p < 0.001). We also established that people value human teammates more than robotic teammates; however, providing robots authority over team coordination more strongly improved the perceived value of these agents than giving similar authority to another human teammate (p < 0.001). In post-hoc analysis, we found that people were more likely to assign a disproportionate amount of the work to themselves when working with a robot (p < 0.01) rather than human teammates only. Based upon our findings, we provide design guidance for roboticists and industry practitioners to design robotic assistants for better integration into the human workplace. In: Autonomous Robots, volume 39, issue 3, pages 293-312. |
2014 | Journal Papers |
Matthew C. Gombolay and Julie A. Shah Schedulability Analysis of Task Sets with Upper- and Lower-Bound Temporal Constraints | BibTeX @ARTICLE{Gombolay:2014d, author={Matthew Gombolay and Julie Shah}, title={Schedulability Analysis of Task Sets with Upper- and Lower-Bound Temporal Constraints}, journal={Journal of Aerospace Information Systems (JAIS)}, volume={11}, number={12}, pages={821-841}, month = {December}, year={2014} } | Abstract Increasingly, real-time systems must handle the self-suspension of tasks (that is, lower-bound wait times between subtasks) in a timely and predictable manner. A fast schedulability test that does not significantly overestimate the temporal resources needed to execute self-suspending task sets would be of benefit to these modern computing systems. In this paper, a polynomial-time test is presented that is known to be the first to handle non-preemptive, self-suspending task sets with hard deadlines, where each task has any number of self-suspensions. To construct the test, a novel priority scheduling policy is leveraged, the jth subtask first, which restricts the behavior of the self-suspending model to provide an analytical basis for an informative schedulability test. In general, the problem of sequencing according to both upper-bound and lower-bound temporal constraints requires an idling scheduling policy and is known to be nondeterministic polynomial-time hard. However, the tightness of the schedulability test and scheduling algorithm are empirically validated, and it is shown that the processor is able to effectively use up to 95% of the self-suspension time to execute tasks. In: Journal of Aerospace Information Systems, volume 11, number 12, pages 821-841. |
Conference Papers |
Matthew C. Gombolay, Reymundo A. Gutierrez, Giancarlo F. Sturla, and Julie A. Shah Decision-Making Authority, Team Efficiency and Human Worker Satisfaction in Mixed Human-Robot Teams | BibTeX @inproceedings{Gombolay:2014b, author = {Matthew Gombolay and Reymundo Gutierrez and Giancarlo Sturla and Julie Shah}, title = {Decision-Making Authority, Team Efficiency and Human Worker Satisfaction in Mixed Human-Robot Teams}, booktitle = {Proceedings of Robotics: Science and Systems (RSS)}, address = {Berkeley, California}, month = {July 12-16}, year = {2014} } | Abstract In manufacturing, advanced robotic technology has opened up the possibility of integrating highly autonomous mobile robots into human teams. However, with this capability comes the issue of how to maximize both team efficiency and the desire of human team members to work with these robotic counterparts. To address this concern, we conducted a set of experiments studying the effects of shared decision-making authority in human-robot and human-only teams. We found that an autonomous robot can outperform a human worker in the execution of part or all of the process of task allocation (p < 0.001 for both), and that people preferred to cede their control authority to the robot (p < 0.001). We also established that people value human teammates more than robotic teammates; however, providing robots authority over team coordination more strongly improved the perceived value of these agents than giving similar authority to another human teammate (p < 0.001). In post-hoc analysis, we found that people were more likely to assign a disproportionate amount of the work to themselves when working with a robot (p < 0.01) rather than human teammates only. Based upon our findings, we provide design guidance for roboticists and industry practitioners to design robotic assistants for better integration into the human workplace. In Proc. Robotics: Science and Systems (RSS). [32% Acceptance Rate] |
Workshop/Symposium Papers |
Matthew C. Gombolay and Julie A. Shah Increasing the Adoption of Autonomous Robotic Teammates in Collaborative Manufacturing | Abstract Advancements in robotic technology are opening up the opportunity to integrate robot workers into the labor force to increase productivity and efficiency. However, removing control from human workers for the sake of efficiency may create resistance from the human workers, preventing this technology from being successfully integrated into the workplace. We describe our ongoing work developing an autonomous robotic teammate to work alongside human workers in a collaborative manufacturing environment. Specifically, we want to understand how to maximize team efficiency and human-worker acceptance of their robotic teammates by utilizing carefully designed human-subject experiments. In Proc. International Conference on Human-Robot Interaction (HRI) Pioneers Workshop. [36% Acceptance Rate] |
Matthew C. Gombolay, Cindy Huang, and Julie A. Shah Coordination of Human-Robot Teaming with Human Task Preferences | BibTeX @inproceedings{Gombolay:2015c, author = {Matthew Gombolay and Cindy Huang and Julie Shah}, title = {Coordination of Human-Robot Teaming with Human Task Preferences}, booktitle = {Proceedings of AAAI Fall Symposium Series on Artificial Intelligence for Human-Robot Interaction (AI-HRI)}, address = {Arlington, Virginia}, month = {November 12–14}, year = {2015} } | Abstract Advanced robotic technology is opening up the possibility of integrating robots into the human workspace to improve productivity and decrease the strain of repetitive, arduous physical tasks currently performed by human workers. However, coordinating these teams is a challenging problem. We must understand how decision-making authority over scheduling decisions should be shared between team members and how the preferences of the team members should be included. We report the results of a human-subject experiment investigating how a robotic teammate should best incorporate the preferences of human teammates into the team’s schedule. We find that humans would rather work with a robotic teammate that accounts for their preferences, but this desire might be mitigated if their preferences come at the expense of team efficiency. In Proc. AAAI Fall Symposium Series on AI-HRI. |
Matthew C. Gombolay and Julie A. Shah Challenges in Collaborative Scheduling of Human-Robot Teams | BibTeX @inproceedings{Gombolay:2014a, author = {Matthew Gombolay and Julie Shah}, title = {Challenges in Collaborative Scheduling of Human-Robot Teams}, booktitle = {Proceedings of AAAI Fall Symposium Series on Artificial Intelligence for Human-Robot Interaction (AI-HRI)}, address = {Arlington, Virginia}, month = {November 13–15}, year = {2014} } | Abstract We study the scheduling of human-robot teams where the human and robotic agents share decision-making authority over scheduling decisions. Our goal is to design AI scheduling techniques that account for how people make decisions under different control schema. In Proc. AAAI Fall Symposium Series on AI-HRI. |
2013 | Conference Papers |
Matthew C. Gombolay, Ron J. Wilcox, and Julie A. Shah Fast Scheduling of Multi-Robot Teams with Temporospatial Constraints | BibTeX @inproceedings{Gombolay:2013b, author = {Matthew Gombolay and Ronald Wilcox and Julie Shah}, title = {Fast Scheduling of Multi-Robot Teams with Temporospatial Constraints}, booktitle = {Proceedings of Robotics: Science and Systems (RSS)}, address = {Berlin, Germany}, month = {June 24-28}, year = {2013} } | Abstract New uses of robotics in traditionally manual manufacturing processes require the careful choreography of human and robotic agents to support safe and efficient coordinated work. Tasks must be allocated among agents and scheduled to meet temporal deadlines and spatial restrictions on agent proximity. These systems must also be capable of replanning on-the-fly to adapt to disturbances in the schedule and to respond to people working in close physical proximity. In this paper, we present a centralized algorithm, named Tercio, that handles tightly intercoupled temporal and spatial constraints and scales to larger problem sizes than prior art. Our key innovation is a fast, satisficing multi-agent task sequencer that is inspired by real-time processor scheduling techniques but is adapted to leverage hierarchical problem structure. We use this fast task sequencer in conjunction with a MILP solver, and show that we are able to generate near-optimal task assignments and schedules for up to 10 agents and 500 tasks in less than 20 seconds on average. Finally, we demonstrate the algorithm in a multi-robot hardware testbed. In Proc. Robotics: Science and Systems (RSS). [30% Acceptance Rate] |
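The Tercio abstract above describes pairing a MILP task-allocation step with a fast, satisficing task sequencer. The sketch below shows only that allocate-then-sequence loop shape under heavy simplification: the greedy allocator and the trivial capacity check stand in for the MILP and the real sequencer, and every function name is invented here for illustration rather than taken from the paper.

```python
import itertools


def allocate(tasks, agents, excluded):
    """Stand-in for the MILP task-allocation step: return the first
    one-task-per-agent assignment that has not already been ruled out."""
    for perm in itertools.permutations(agents, len(tasks)):
        assignment = dict(zip(tasks, perm))
        if frozenset(assignment.items()) not in excluded:
            return assignment
    return None


def sequence(assignment, durations, deadline):
    """Stand-in for the fast satisficing sequencer: accept the allocation
    only if every agent's summed workload fits within the deadline."""
    load = {}
    for task, agent in assignment.items():
        load[agent] = load.get(agent, 0.0) + durations[task]
    return max(load.values()) <= deadline


tasks = ["drill", "rivet", "inspect"]
agents = ["robot_1", "robot_2", "human"]
durations = {"drill": 4.0, "rivet": 3.0, "inspect": 2.0}

excluded = set()
while True:
    assignment = allocate(tasks, agents, excluded)
    if assignment is None:
        print("no schedulable allocation found")
        break
    if sequence(assignment, durations, deadline=5.0):
        print("schedulable allocation:", assignment)
        break
    # Infeasible allocation: exclude it (a cut) and re-allocate.
    excluded.add(frozenset(assignment.items()))
```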
Workshop/Symposium Papers |
Matthew C. Gombolay, Ron J. Wilcox, Ana Diaz, Fei Yu, and Julie A. Shah Towards Successful Coordination of Human and Robotic Work using Automated Scheduling Tools: An Initial Pilot Study | BibTeX @inproceedings{Gombolay:2013c, author = {Matthew Gombolay and Ronald Wilcox and Ana Diaz Artiles and Fei Yu and Julie Shah}, title = {Towards Successful Coordination of Human and Robotic Work using Automated Scheduling Tools: An Initial Pilot Study}, booktitle = {Proceedings of Robotics: Science and Systems (RSS) Human-Robot Collaboration Workshop}, address = {Berlin, Germany}, month = {June 24-28}, year = {2013} } | Abstract With the latest advancements in robotic manufacturing technology, there is a desire to integrate robot workers into the labor force to increase productivity and efficiency. However, coordinating the efforts of humans and robots in close physical proximity and under tight temporal constraints poses challenges in planning and scheduling and the design of human-robot interaction. In prior work, we presented a scheduling algorithm capable of performing the coordination of heterogeneous multi-agent teams. Given this capability, we now want to understand how best to implement this technology from a human-centered perspective. Humans derive purpose and identity in their roles at work, and requiring them to dynamically change roles at the direction of an automated scheduling algorithm may result in the human worker feeling devalued. Ultimately, overall productivity of the human-robot team may degrade as a result. In this paper, we report the results of a human-subject pilot study aimed at answering how best to implement such an automated scheduling system. Specifically, we test whether giving humans more control over the task allocation process improves worker satisfaction, and we empirically measure the trade-offs of giving this control in terms of overall process efficiency. In Proc. Robotics: Science and Systems (RSS), Human-Robot Collaboration Workshop. |
Thesis |
Matthew C. Gombolay and Julie A. Shah Fast Methods for Scheduling with Applications to Real-Time Systems and Large-Scale Robotic Manufacturing of Aerospace Structures | BibTeX @MASTERSTHESIS{Gombolay:2013a, author={Matthew Gombolay}, title={Fast Methods for Scheduling with Applications to Real-Time Systems and Large-Scale Robotic Manufacturing of Aerospace Structures}, school={Department of Aeronautics and Astronautics, Massachusetts Institute of Technology}, month = {June}, year={2013} } | S.M. Thesis: Master of Science in Aeronautics and Astronautics. |
2012 |
Matthew C. Gombolay and Julie A. Shah A Uniprocessor Scheduling Policy for Non-Preemptive Task Sets with Precedence and Temporal Constraints | BibTeX @inproceedings{Gombolay:2012, author = {Matthew Gombolay and Julie Shah}, title = {A Uniprocessor Scheduling Policy for Non-Preemptive Task Sets with Precedence and Temporal Constraints}, booktitle = {Proceedings of AIAA Infotech@Aerospace}, address = {Garden Grove, California}, month = {June 19-21}, year = {2012} } | Abstract We present an idling, dynamic priority scheduling policy for non-preemptive task sets with precedence, wait, and deadline constraints. The policy operates on a well-formed task model where tasks are related through a hierarchical temporal constraint structure found in many real-world applications. In general, the problem of sequencing according to both upper-bound and lower-bound temporal constraints requires an idling scheduling policy and is known to be NP-complete. However, we show through empirical evaluation that, for a given task set, our polynomial-time scheduling policy is able to sequence the tasks such that the overall duration required to execute the task set, the makespan, is within a few percent of the theoretical lower-bound makespan. In Proc. AIAA Infotech@Aerospace. *AIAA Best Intelligent Systems Paper Award 2012 |
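As a rough illustration of the "idling" requirement mentioned in this abstract: when a task carries a lower-bound wait (an earliest-start time), a non-preemptive dispatcher may have to leave the processor idle even though work remains. The toy earliest-deadline dispatcher below is a hypothetical sketch of that behavior only, not the well-formed-task-model policy the paper presents.

```python
def dispatch(tasks):
    """tasks: list of (name, earliest_start, duration, deadline) tuples."""
    time, schedule = 0.0, []
    pending = sorted(tasks, key=lambda t: t[3])      # keep pending tasks ordered by deadline
    while pending:
        ready = [t for t in pending if t[1] <= time]
        if not ready:
            time = min(t[1] for t in pending)        # idle until the next earliest-start time
            continue
        job = ready[0]                               # earliest deadline among ready tasks
        pending.remove(job)
        name, est, dur, dl = job
        finish = time + dur                          # non-preemptive: run to completion
        schedule.append((name, time, finish, finish <= dl))
        time = finish
    return schedule


jobs = [("A", 0.0, 2.0, 6.0), ("B", 3.0, 1.0, 5.0)]
for name, start, end, met in dispatch(jobs):
    print(f"{name}: start={start}, end={end}, deadline met={met}")
# The dispatcher idles from t=2.0 to t=3.0 because B's wait constraint has not yet elapsed.
```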
2011 | Conference Papers |
Matthew C. Gombolay, Sam Beder, Robert Boggio, John Samsundar, P. Stadter, and P. Binning Scheduling of Oversubscribed Space-Based Sensors for Dynamic Objects of Interest. In Proc. of the 9th Annual U.S. Missile Defense Conference and Exhibit, 2011. |
T. Safko, D. Kelly, S. Guzewich, S. Bell, A. S. Rivkin, K. W. Kirby, R. E. Gold, A. F. Cheng, T. M. Aldridge, C. M. Colon, A. D. Colson, D. V. Lantukh, P. Pashai, D. Quinn, E. H. Yun, and the ASTERIA team. ASTERIA: A Robotic Precursor Mission to Near-Earth Asteroid 2002 TD60. In Proc. of the Lunar and Planetary Science Conference. |