Ever since the attacks of September 11, 2001, prevention of nuclear terrorist attacks in urban environments has been a major focus of homeland security. To that end, mobile radiation sensor networks deployed within a specific area to acquire consecutive measurements are a first line of defense against the illicit movement of nuclear threats. However, sensor network deployment is a complex process subject to physical and financial constraints and to dynamically varying conditions. In this work, reinforcement learning (RL) is applied to control the sequential deployment of a mobile radiation sensor network within a specific geographic area. RL is used to dynamically learn the environment and to decide on the optimal positions of the network sensors, with decisions driven by the mutual information shared among them. RL has the benefit of allowing the network to learn and update a deployment strategy online, starting from an initially unknown state.
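To make the learn-while-deploying loop concrete, the sketch below shows one plausible realization of the idea, not the authors' implementation: a single mobile sensor repositions on a grid via tabular Q-learning and is rewarded by the information it gains about a hidden source location (an information-gain proxy for the mutual-information criterion). The grid size, the inverse-square count model, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8                                   # assumed N x N search grid
source = (5, 2)                         # hidden source cell (unknown to the agent)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # move sensor up/down/left/right

def expected_counts(sensor, cell, strength=50.0, bg=1.0):
    """Assumed inverse-square count model plus background rate."""
    d2 = (sensor[0] - cell[0]) ** 2 + (sensor[1] - cell[1]) ** 2 + 1.0
    return bg + strength / d2

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

Q = np.zeros((N, N, len(ACTIONS)))      # tabular Q-values over sensor positions
alpha, gamma, eps = 0.2, 0.95, 0.2      # illustrative hyperparameters

for episode in range(200):
    belief = np.full((N, N), 1.0 / N**2)    # uniform prior over source cells
    pos = (0, 0)
    for step in range(40):
        # Epsilon-greedy action selection over repositioning moves.
        a = (rng.integers(len(ACTIONS)) if rng.random() < eps
             else int(np.argmax(Q[pos])))
        dr, dc = ACTIONS[a]
        nxt = (min(max(pos[0] + dr, 0), N - 1),
               min(max(pos[1] + dc, 0), N - 1))
        # Simulate a Poisson count at the new position, then update the
        # belief over source locations with the (unnormalized) likelihood.
        k = rng.poisson(expected_counts(nxt, source))
        lam = np.fromfunction(
            lambda r, c: expected_counts(nxt, (r, c)), (N, N))
        loglike = k * np.log(lam) - lam
        post = belief * np.exp(loglike - loglike.max())
        post /= post.sum()
        # Reward = reduction in belief entropy, i.e. information gained.
        reward = entropy(belief) - entropy(post)
        # Standard Q-learning update driven by the information-gain reward.
        Q[pos][a] += alpha * (reward + gamma * np.max(Q[nxt]) - Q[pos][a])
        belief, pos = post, nxt
```

In the paper's multi-sensor setting, the reward would instead couple the sensors through their shared mutual information; the single-sensor sketch above only illustrates how a deployment strategy can be learned and updated online from an initially unknown state.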

The performance of the RL method is demonstrated through self-contained exploration and interaction between sensors in a source search scenario, in which a set of mobile detectors must locate a radioactive source within the University of Texas at San Antonio campus. Results exhibit the efficiency and efficacy of the non-sequential RL deployment in comparison to sequential placement of the mobile sensors, demonstrating superior accuracy and efficiency in source detection.