Hi there, I'm Kaustubh Sridhar!

I'm a PhD candidate in Electrical and Systems Engineering at the University of Pennsylvania, where I'm advised by Insup Lee, James Weimer and Oleg Sokolsky. My CV can be found here.

I worked on model-free RL augmentations for model-based virtual machine packing in datacenters at Amazon Web Services (AWS) AI Labs in summer 2022. Previously, I did an internship at Argo AI's autonomous vehicle security and functional safety (FuSa) team in summer 2021. Before that, I graduated with honors from the Indian Institute of Technology Bombay and spent a summer as an intern at Duke University.


My research interests broadly span Deep Reinforcement Learning (RL) and Robust Deep Learning. In the former, I am interested in sample-efficient RL and RL for combinatorial optimization. In the latter, I am interested in adversarial robustness and out-of-distribution (OOD) detection. My research interests also encompass safety and security for autonomous vehicles and cyber-physical systems. All of my work is summarized below.


Exploring with Sticky Mittens: Reinforcement Learning with Expert Interventions via Option Templates

Souradeep Dutta, Kaustubh Sridhar, Osbert Bastani, Edgar Dobriban, James Weimer, Insup Lee, Julia Parish-Morris

CoRL 2022

arXiv / OpenReview / website (and videos) / code (craft and fetch environments, google football environment)

Environments with sparse rewards and long horizons pose a significant challenge for current reinforcement learning algorithms. A key feature enabling humans to learn challenging control tasks is that they often receive expert intervention that enables them to understand the high-level structure of the task before mastering low-level control actions. We propose a framework for leveraging expert intervention to solve long-horizon reinforcement learning tasks. We consider option templates, which are specifications encoding a potential option that can be trained using reinforcement learning. We formulate expert intervention as allowing the agent to execute option templates before learning an implementation. This enables the agent to use an option before committing costly resources to learning it. We evaluate our approach on three challenging reinforcement learning problems, showing that it outperforms state-of-the-art approaches by an order of magnitude.


Improving Neural Network Robustness via Persistency of Excitation

Kaustubh Sridhar, Oleg Sokolsky, Insup Lee, James Weimer

ACC 2022

arXiv / code / poster / video / leaderboard

Improving adversarial robustness of neural networks remains a major challenge. Fundamentally, training a neural network via gradient descent is a parameter estimation problem. In adaptive control, maintaining persistency of excitation (PoE) is integral to ensuring convergence of parameter estimates in dynamical systems to their true values. We show that parameter estimation with gradient descent can be modeled as a sampling of an adaptive linear time-varying continuous system. Leveraging this model, and with inspiration from Model-Reference Adaptive Control, we prove a sufficient condition to constrain gradient descent updates to reference persistently excited trajectories converging to the true parameters. The sufficient condition is achieved when the learning rate is less than the inverse of the Lipschitz constant of the gradient of the loss function. Our experimental results in both standard and adversarial training illustrate that networks trained with the PoE-motivated learning rate schedule have similar clean accuracy but are significantly more robust to adversarial attacks than state-of-the-art models.
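The learning-rate condition in this abstract can be illustrated on a toy problem. The sketch below is not the paper's code or its learning rate schedule for neural networks; it assumes a hypothetical quadratic loss 0.5 * w^T A w - b^T w, for which the Lipschitz constant of the gradient is simply the largest eigenvalue of A. The matrix A, the vector b, the 0.9 safety factor and all function names are illustrative assumptions.

```python
def matvec(A, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def gradient_lipschitz_constant(A, iters=200):
    """Estimate the largest eigenvalue of a symmetric positive-definite A
    by power iteration.

    For the toy loss 0.5 * w^T A w - b^T w, this eigenvalue is the
    Lipschitz constant of the gradient A w - b.
    """
    v = [1.0] * len(A)
    for _ in range(iters):
        Av = matvec(A, v)
        norm = sum(x * x for x in Av) ** 0.5
        v = [x / norm for x in Av]
    # Rayleigh quotient of the unit-norm iterate.
    return sum(x * y for x, y in zip(v, matvec(A, v)))


def gradient_descent(A, b, lr, steps=200):
    """Plain gradient descent on the toy quadratic, starting from zero."""
    w = [0.0] * len(b)
    for _ in range(steps):
        grad = [g - bi for g, bi in zip(matvec(A, w), b)]
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


# Hypothetical problem data (assumption, chosen only for illustration).
A = [[3.0, 1.0], [1.0, 2.0]]
b = [1.0, 1.0]

L = gradient_lipschitz_constant(A)
lr = 0.9 / L  # strictly below the 1 / L bound
w = gradient_descent(A, b, lr)
print(L, lr, w)
```

On this quadratic, the minimizer solves A w = b, which is w = (0.2, 0.4), and the iterates approach it when the step size stays below 1 / L. For a neural network the Lipschitz constant is not available in closed form and would have to be estimated.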


CODiT: Conformal Out-of-distribution Detection in Time-series Data

Ramneet Kaur, Kaustubh Sridhar, Sangdon Park, Susmit Jha, Anirban Roy, Oleg Sokolsky, Insup Lee

PODS Workshop, ICML 2022

arXiv / code

Machine learning models are prone to making incorrect predictions on inputs that are far from the training distribution. This hinders their deployment in safety-critical applications such as autonomous vehicles and healthcare. The detection of a shift from the training distribution of individual samples has gained attention. A number of techniques have been proposed for such out-of-distribution (OOD) detection. But in many applications, the inputs to a machine learning model form a temporal sequence, and existing techniques do not exploit these temporal relationships or provide any guarantees on detection. We develop a self-supervised learning approach, CODiT, for OOD detection of time-series data with guarantees on detection by using the deviation in the in-distribution (iD) temporal equivariance learned by a model as the non-conformity measure in the conformal anomaly detection framework. We illustrate the efficacy of CODiT by achieving state-of-the-art results on computer vision datasets in autonomous driving, and the GAIT sensory dataset.
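The conformal anomaly detection step mentioned in this abstract can be sketched generically. This is not the CODiT implementation: the non-conformity scores here are hypothetical scalars standing in for the temporal-equivariance deviation, and the function names, calibration values and threshold epsilon are assumptions for illustration. The p-value formula is the standard inductive conformal one.

```python
def conformal_p_value(calibration_scores, test_score):
    """Fraction of calibration scores at least as non-conforming as the
    test score, with the usual +1 correction for the test point itself."""
    n_at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)


def is_ood(calibration_scores, test_score, epsilon=0.05):
    """Flag the input as out-of-distribution when its p-value is below epsilon."""
    return conformal_p_value(calibration_scores, test_score) < epsilon


# Hypothetical non-conformity scores from held-out in-distribution data.
calibration = [i / 100 for i in range(1, 40)]  # 0.01, 0.02, ..., 0.39

p_typical = conformal_p_value(calibration, 0.205)
p_extreme = conformal_p_value(calibration, 0.9)
print(p_typical, is_ood(calibration, 0.205))
print(p_extreme, is_ood(calibration, 0.9))
```

When calibration and test scores are exchangeable, an in-distribution input is flagged with probability at most epsilon, which is the kind of detection guarantee the conformal framework provides.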


Real-Time Detectors for Digital and Physical Adversarial Inputs to Perception Systems

Yiannis Kantaros, Taylor Carpenter, Kaustubh Sridhar, Yahan Yang, Insup Lee, James Weimer

ICCPS 2021

arXiv / proceedings / dataset webpage

We propose a novel attack- and dataset-agnostic and real-time detector for digital / physical adversarial inputs to DNN-based perception systems. We demonstrate the efficiency of the proposed detector on ImageNet, a task that is computationally challenging for the majority of relevant defenses, and on physically attacked traffic signs that may be encountered in real-time autonomy applications.


Towards Alternative Techniques for Improving Adversarial Robustness: Analysis of Adversarial Training at a Spectrum of Perturbations

Kaustubh Sridhar, Souradeep Dutta, Ramneet Kaur, James Weimer, Oleg Sokolsky, Insup Lee

arXiv / code

Adversarial training (AT) and its variants have spearheaded progress in improving neural network robustness to adversarial perturbations and common corruptions in the last few years. Algorithm design of AT and its variants is focused on training models at a specified perturbation strength ϵ and only using the feedback from the performance of that ϵ-robust model to improve the algorithm. In this work, we focus on models trained on a spectrum of ϵ values. We analyze three perspectives: model performance, intermediate feature precision and convolution filter sensitivity. In each, we identify alternative improvements to AT that otherwise wouldn't have been apparent at a single ϵ. Specifically, we find that for a PGD attack at some strength δ, there is an AT model at some slightly larger strength ϵ, but no greater, that generalizes best to it. Hence, we propose overdesigning for robustness where we suggest training models at an ϵ just above δ. Second, we observe (across various ϵ values) that robustness is highly sensitive to the precision of intermediate features and particularly those after the first and second layer. Thus, we propose adding a simple quantization to defenses that improves accuracy on seen and unseen adaptive attacks. Third, we analyze convolution filters of each layer of models at increasing ϵ and notice that those of the first and second layer may be solely responsible for amplifying input perturbations. We present our findings and demonstrate our techniques through experiments with ResNet and WideResNet models on the CIFAR-10 and CIFAR-10-C datasets.


Recovery from Adversarial Attacks in Cyber-physical Systems: Shallow, Deep and Exploratory Research

Pengyuan Lu, Mengyu Liu, Lin Zhang, Kaustubh Sridhar, Oleg Sokolsky, Fanxin Kong, Insup Lee


Cyber-physical systems (CPS) have experienced rapid growth in recent decades. However, like any other computer-based systems, malicious attacks evolve mutually, driving CPS to undesirable physical states and potentially causing catastrophes. Although the current state-of-the-art is well aware of this issue, the majority of researchers focus on attack detection rather than recovery, the procedure that brings a system back to its designed behaviors and minimizes casualty. To call for attention on CPS recovery and identify existing efforts, we have surveyed relevant papers addressing various attack scenarios and system assumptions. We divide recovery techniques into two major categories: shallow vs. deep, where the latter explicitly guides the system back to a pre-defined normal state while the former does not. The two categories are then further broken into sub-categories by specific recovery techniques. Additionally, we surveyed exploratory research on topics that facilitate recovery. From these publications, we analyze possible future directions and challenges based on 6 dimensions of recovery frameworks: soundness, time overhead, development resource overhead, runtime resource overhead, attack scope, and system assumptions. Overall, this survey serves as an organized overview of existing countermeasures against CPS attacks, while showing a large remaining problem space yet to be explored.


Fail-Safe: Securing Cyber-Physical Systems against Hidden Sensor Attacks

Mengyu Liu, Lin Zhang, Pengyuan Lu, Kaustubh Sridhar, Fanxin Kong, Oleg Sokolsky, Insup Lee

RTSS 2022


A pressing risk in CPS is hidden sensor attacks that are designed by powerful attackers with full knowledge of the system and its detector. These hidden attacks inject a small malicious signal into sensor measurements that can be ignored by the onboard detector. To secure CPS, we propose a detection framework to identify these hidden sensor attacks even when they avoid the onboard detector.


A Framework for Checkpointing and Recovery of Hierarchical Cyber-Physical Systems

Kaustubh Sridhar, Radoslav Ivanov, Marcio Juliato*, Manoj Sastry*, Vuk Lesi*, Lily Yang*, James Weimer, Oleg Sokolsky, Insup Lee

Collaboration with Intel Labs(*).

arXiv / code

We tackle the problem of making complex resource-constrained cyber-physical systems (CPS) resilient to sensor anomalies. We present a framework for checkpointing and roll-forward recovery of state-estimates in nonlinear, hierarchical CPS with anomalous sensor data. A simulated ground robot case-study demonstrates its scalability and improved performance over an Extended Kalman Filter.


  • Outstanding Reviewer (Top 10%), ICML, 2022
  • The Dean’s Fellowship and The Howard Bradwell Fellowship, University of Pennsylvania, 2019
  • SN Bose Scholarship, the Indo-U.S. Science and Technology Forum, 2018
  • KVPY Fellowship, Govt. of India, 2015

Undergraduate Research

Information about my research at Duke University, IIT Bombay and the Indian Institute of Science (IISc) Bangalore on ground and aerial robotics can be found here.


I enjoy a game of tennis and squash and avidly read science-fiction.

Some other links: