Abstract: The popularity of automatic speech recognition (ASR) systems such as Google Voice, Cortana, and Amazon Echo raises security concerns, as recent attacks have demonstrated. The impact of those attacks, however, is less clear, since each of them is either less stealthy (producing noise-like voice commands), dependent on the physical presence of an attack device (using ultrasound), or impractical (unable to attack physical speech recognition devices). In this talk, I will show that more practical and surreptitious attacks are not only feasible but can even be constructed automatically. Specifically, voice commands can be stealthily embedded into songs which, when played, effectively control the target system through ASR without being noticed. I will present novel techniques that address a key technical challenge: integrating the commands into a song so that they are effectively recognized by ASR over the air, in the presence of background noise, while remaining undetected by a human listener. Our research shows that this can be done automatically against real-world ASR systems, and even against devices such as Google Home, Amazon Echo, and Apple Siri.