I will be working in the space of RL for anomaly detection at the Amazon Web Services (AWS) AI Lab this upcoming summer. In the summer of 2021, I interned with Argo AI's autonomous vehicle security and functional safety (FuSa) team. Before that, I graduated with honors from the Indian Institute of Technology Bombay and spent a summer as an intern at Duke University.
My research aims to build systems that learn quickly and are robust to long-tail scenarios. Toward that goal, I am working on the following:
Environments with sparse rewards and long horizons pose a significant challenge for current reinforcement learning algorithms. A key feature enabling humans to learn challenging control tasks is that they often receive expert intervention, which helps them understand the high-level structure of the task before mastering low-level control actions. We propose a framework for leveraging expert intervention to solve long-horizon reinforcement learning tasks. We consider option templates: specifications encoding a potential option that can be trained using reinforcement learning. We formulate expert intervention as allowing the agent to execute option templates before learning an implementation. This enables the agent to use an option before committing costly resources to learning it. We evaluate our approach on three challenging reinforcement learning problems, showing that it outperforms state-of-the-art approaches by an order of magnitude.
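The core idea above can be illustrated with a minimal sketch. All names here (`OptionTemplate`, `initiation`, `effect`) are illustrative, not from the paper: an option template exposes an option's high-level effect so the agent can invoke it while exploring, before any low-level implementation has been trained.

```python
class OptionTemplate:
    """Specification of an option: when it applies and what it achieves."""
    def __init__(self, name, initiation, effect):
        self.name = name
        self.initiation = initiation  # state -> bool: can the option start here?
        self.effect = effect          # state -> next state (oracle execution)
        self.policy = None            # learned low-level implementation

    def execute(self, state):
        # Before an implementation is learned, fall back to the template's
        # oracle effect; afterwards, run the learned policy.
        if self.policy is not None:
            return self.policy(state)
        return self.effect(state)

# Toy 1-D chain: the template "advance" jumps the agent 5 cells toward goal 10.
advance = OptionTemplate(
    name="advance",
    initiation=lambda s: s < 10,
    effect=lambda s: min(s + 5, 10),
)

state, steps = 0, 0
while state < 10 and advance.initiation(state):
    state = advance.execute(state)
    steps += 1
# The agent reaches the goal in 2 macro-steps using only the template,
# revealing the task's high-level structure before any low-level training.
```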
Kaustubh Sridhar, Oleg Sokolsky, Insup Lee, James Weimer
Improving adversarial robustness of neural networks remains a major challenge. Fundamentally, training a neural network via gradient descent is a parameter estimation problem. In adaptive control, maintaining persistency of excitation (PoE) is integral to ensuring that parameter estimates in dynamical systems converge to their true values. We show that parameter estimation with gradient descent can be modeled as a sampling of an adaptive linear time-varying continuous system. Leveraging this model, and with inspiration from Model-Reference Adaptive Control, we prove a sufficient condition to constrain gradient descent updates to reference persistently excited trajectories converging to the true parameters. The sufficient condition is satisfied when the learning rate is less than the inverse of the Lipschitz constant of the gradient of the loss function. Our experimental results in both standard and adversarial training illustrate that networks trained with the PoE-motivated learning rate schedule have similar clean accuracy but are significantly more robust to adversarial attacks than state-of-the-art models.
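The learning rate condition can be made concrete in a small example. This is a sketch of the standard step-size bound the abstract refers to, not the paper's full PoE schedule: for a least-squares loss, the gradient's Lipschitz constant is the largest eigenvalue of X^T X / n, and gradient descent with a learning rate below 1/L recovers the true parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))          # persistently exciting inputs
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

# Lipschitz constant of the gradient of 0.5/n * ||Xw - y||^2
L = np.linalg.eigvalsh(X.T @ X / len(X)).max()
lr = 0.9 / L                           # satisfies the condition lr < 1/L

w = np.zeros(3)
for _ in range(500):
    grad = X.T @ (X @ w - y) / len(X)  # gradient of the least-squares loss
    w -= lr * grad

# w converges to the true parameters w_true
```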
Yiannis Kantaros, Taylor Carpenter, Kaustubh Sridhar, Yahan Yang, Insup Lee, James Weimer
We propose a novel attack-agnostic, dataset-agnostic, and real-time detector for digital and physical adversarial inputs to DNN-based perception systems. We demonstrate the efficiency of the proposed detector on ImageNet, a task that is computationally challenging for the majority of relevant defenses, and on physically attacked traffic signs that may be encountered in real-time autonomy applications.
arXiv (TBA) / code (TBA)
Machine learning models are prone to making incorrect predictions on inputs that are far from the training distribution. This hinders their deployment in safety-critical applications such as autonomous vehicles and healthcare. Detecting whether individual samples have shifted away from the training distribution has therefore gained attention, and a number of techniques have been proposed for such out-of-distribution (OOD) detection. But in many applications, the inputs to a machine learning model form a temporal sequence, and existing techniques neither exploit these temporal relationships nor provide any guarantees on detection. We develop CODiT, a self-supervised learning approach for OOD detection of time-series data with guarantees on detection, which uses the deviation in the in-distribution (iD) temporal equivariance learned by a model as the non-conformity measure in the conformal anomaly detection framework. We illustrate the efficacy of CODiT by achieving state-of-the-art results on computer vision datasets in autonomous driving, and on the GAIT sensory dataset.
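The conformal anomaly detection framework underlying this work can be sketched generically. The example below simplifies away CODiT's actual non-conformity score (the deviation in learned temporal equivariance) and shows only the standard inductive conformal p-value computation that provides the detection guarantee.

```python
import numpy as np

def conformal_p_value(cal_scores, test_score):
    """p-value of a test sample given iD calibration non-conformity scores."""
    n = len(cal_scores)
    # Fraction of calibration scores at least as non-conforming as the test
    # score; the +1 terms ensure validity of the guarantee.
    return (np.sum(cal_scores >= test_score) + 1) / (n + 1)

rng = np.random.default_rng(1)
cal_scores = rng.normal(0.0, 1.0, size=999)  # iD calibration scores (toy)

p_id = conformal_p_value(cal_scores, 0.0)    # typical in-distribution score
p_ood = conformal_p_value(cal_scores, 8.0)   # far out-of-distribution score

alpha = 0.05
# An iD sample keeps a large p-value, while the OOD sample is rejected at
# level alpha; under exchangeability, the false-alarm rate is bounded by alpha.
```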
Kaustubh Sridhar, Radoslav Ivanov, Marcio Juliato*, Manoj Sastry*, Vuk Lesi*, Lily Yang*, James Weimer, Oleg Sokolsky, Insup Lee
Collaboration with Intel Labs (*).
We tackle the problem of making complex resource-constrained cyber-physical systems (CPS) resilient to sensor anomalies. We present a framework for checkpointing and roll-forward recovery of state-estimates in nonlinear, hierarchical CPS with anomalous sensor data. A simulated ground robot case-study demonstrates its scalability and improved performance over an Extended Kalman Filter.
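A minimal sketch of the checkpoint-and-roll-forward idea (all names here are illustrative, and the scalar dynamics are a toy stand-in for the nonlinear, hierarchical estimators in the paper): keep a checkpoint of the last trusted state estimate, and while sensor data is flagged anomalous, roll the checkpoint forward with the dynamics model alone instead of fusing corrupted measurements.

```python
def predict(x, u):
    return x + u                       # toy dynamics: integrate the input

def correct(x_pred, z, gain=0.5):
    return x_pred + gain * (z - x_pred)  # blend prediction with measurement

def estimate(inputs, measurements, anomalous):
    x = 0.0
    checkpoint = x                     # last trusted state estimate
    for u, z, bad in zip(inputs, measurements, anomalous):
        if bad:
            # Anomalous sensor data: roll forward from the checkpoint
            # using the dynamics model only, ignoring the measurement z.
            x = predict(checkpoint, u)
        else:
            x = correct(predict(x, u), z)
        checkpoint = x
    return x

# Ground truth moves 1.0 per step; the third measurement is spoofed.
inputs = [1.0, 1.0, 1.0]
measurements = [1.0, 2.0, 100.0]
anomalous = [False, False, True]
# estimate(inputs, measurements, anomalous) stays near the true state 3.0
# despite the spoofed reading.
```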
Information about my research on ground and aerial robotics at Duke University, IIT Bombay, and the Indian Institute of Science (IISc) Bangalore can be found here.
I enjoy playing tennis and squash, and I avidly read science fiction.
Some other links: