Auditory perception of emotions in speech is important for humans to navigate the social environment optimally. While sensory perception is known to be influenced by internal bodily states such as anxiety and by ambient noise, their influence on human auditory perception is comparatively poorly understood. In a supervised, internet-based experiment carried out outside the artificially controlled laboratory environment, we asked whether the detection sensitivity for emotions conveyed by human speech-in-noise (acoustic signals) is modulated by individual differences in affective internal states, e.g., anxiety. In the task, participants (n = 24) accurately discriminated the target emotion conveyed by temporally unpredictable acoustic signals (signal-to-noise ratio = 10 dB), which were manipulated at four levels (Happy, Neutral, Fear and Disgust). To quantify detection sensitivity for the acoustic signals, we calculated the empirical area under the curve based on signal detection theory. Detection sensitivity for Disgust and Fear worsened with increasing severity of individual trait anxiety. A similar effect was evident when averaging across all four emotions. Altogether, the results suggest that individual trait-anxiety levels moderate the detection of emotions from speech-in-noise, especially those conveying negative/threatening affect. The findings may be relevant for expanding the understanding of auditory perception anomalies underlying affective states and disorders.
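
The abstract does not state how the empirical area under the curve was computed. As an illustration only, the sketch below shows one common nonparametric estimator from signal detection theory: the proportion of (signal trial, noise trial) pairs in which the signal trial received the higher response score, with ties counted as one half. This is equivalent to the trapezoidal area under the empirical ROC curve. The function name `empirical_auc`, the rating scale, and the example scores are hypothetical and are not taken from the study.

```python
def empirical_auc(signal_scores, noise_scores):
    """Nonparametric AUC: P(signal > noise) + 0.5 * P(signal == noise).

    Assumption: each trial yields an ordinal response score (for example a
    confidence rating); higher scores mean "target emotion present".
    """
    if not signal_scores or not noise_scores:
        raise ValueError("both trial types need at least one score")

    wins = 0.0
    for s in signal_scores:
        for n in noise_scores:
            if s > n:
                wins += 1.0   # signal trial ranked above noise trial
            elif s == n:
                wins += 0.5   # ties contribute half a win
    return wins / (len(signal_scores) * len(noise_scores))


# Illustrative (made-up) ratings on a 1-4 scale for one participant.
target_present = [3, 4, 2, 4]
target_absent = [1, 2, 3, 1]
auc = empirical_auc(target_present, target_absent)
print(auc)
```

A value of 0.5 corresponds to chance-level detection and 1.0 to perfect separation of signal and noise trials; under this sketch, lower values at higher trait-anxiety scores would correspond to the worsened sensitivity described above.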