Shashwat Raj
Computer systems engineering
Hometown: Tempe, Arizona, United States
Graduation date: Spring 2027
Additional details: First-generation college student
GCSP research stipend | Summer 2025
Optimizing Earth Science Observations: Developing Reinforcement Learning Techniques for Autonomously Determining Priority Observations in a Dynamic Environment
Dynamic targeting is a widely researched technique in which a satellite points its instruments in the direction expected to maximize scientific yield. Conventional methods, however, rely on scheduling systems to determine which points on the map are worth observing. Auto-scheduler systems are effective at selecting priority observations, but they operate neither in real time nor with full accuracy. Consequently, incorrect decisions by the satellite waste power and produce redundant data. This research aims to enable satellites to autonomously determine valid observation locations in real time during Earth science missions. The project focuses on creating an RL environment that emulates a satellite’s decisions about whether to measure at priority observation points. The work involves using large volumes of simulated environmental data from the NASA GEOS-5 dataset, selecting and training a model, testing and tuning it, and evaluating it against a random forest classifier. In the research so far, a DQN model and a Quantile Regression DQN (QRDQN) model have reached 79% recall. However, the F1 score was low because of a high number of false-positive actions. The low precision is potentially due to oversampling near the equator (0° latitude) and the severe class imbalance in the datasets derived from NASA GEOS-5. The project therefore focuses on preparing larger training datasets spanning 8 to 12 months, with additional features such as cloud optical thickness and incoming shortwave flux. It further aims to combine decision transformers or STTNs with reinforcement learning in an ensemble to tackle the oversampling challenge. The research will be considered a success when the results agree with the simulated data on convective precipitation storms and demonstrate greater precision than supervised learning models such as the random forest classifier.
The ultimate goal of the research is to enable satellites to autonomously prioritize areas with high atmospheric activity, enhancing environmental monitoring and providing deeper insights into Earth’s atmospheric dynamics.
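The relationship between high recall, many false positives, and a low F1 score described in the abstract can be illustrated with a minimal Python sketch. The confusion counts below are hypothetical, chosen only so that recall equals the reported 79%; they are not results from the project, and the function name is an assumption made for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts (NOT project data): 100 true priority points,
# of which 79 are observed, plus 400 observations of non-priority points.
tp, fn, fp = 79, 21, 400

precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, a large number of false positives lowers precision and pulls F1 down even when recall stays high, which is the pattern the abstract reports.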
Glossary: RL (reinforcement learning), DQN (Deep Q-Network), QRDQN (Quantile Regression Deep Q-Network), STTNs (Spatio-Temporal Transformer Networks), GEOS-5 (Goddard Earth Observing System, Version 5)
Mentor: Paul Grogan