Most approaches to assessing and improving soundscapes are holistic. At one end, subjective measurements typically involve the evaluation of a complete soundscape by human listeners, mostly through questionnaires. At the other end, psycho-acoustic measurements try to capture low-level perceptual attributes in quantities such as loudness. However, these two types of soundscape measurement are difficult to link other than by correlational measures. More specifically, the link between subjective human evaluation and psycho-acoustic measures is made directly, whereas human soundscape perception involves many intermediate processing steps. We therefore propose a method inspired by cognitive research to gain a better understanding of human soundscape perception. Furthermore, by modeling auditory cognition, we aim to improve automatic soundscape evaluation. Instead of trying to measure a complete soundscape, we identify components within it. First, we select structures in the sound signal that are likely to stem from a single source. Subsequently, we use models of human memory to generate rich descriptions of these components. The model is trained on annotations provided by human listeners, so that during operation it predicts the descriptions human listeners would give.
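To make the two stages concrete, the following is a minimal sketch in Python of the kind of pipeline described above: an energy-based selection of signal regions that plausibly stem from a single source, followed by a lookup against a small "memory" of human-annotated examples that stands in for the learned memory model. All names, features, and thresholds (select_components, describe, the toy annotation memory) are illustrative assumptions, not the method itself.

```python
# Minimal sketch of the two-stage pipeline: component selection, then
# description via a memory of human annotations. Illustrative only.
import numpy as np

def select_components(signal, sr, frame_len=1024, hop=512, threshold_db=-30.0):
    """Select contiguous high-energy regions of the signal that are likely
    to stem from a single source (a simplified stand-in for component selection)."""
    frames = np.lib.stride_tricks.sliding_window_view(signal, frame_len)[::hop]
    energy_db = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    active = energy_db > (energy_db.max() + threshold_db)  # relative threshold
    components, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i
        elif not is_active and start is not None:
            components.append(signal[start * hop : i * hop + frame_len])
            start = None
    if start is not None:
        components.append(signal[start * hop :])
    return components

def describe(component, memory):
    """Predict a description for a component by comparing its summary features
    against stored (feature, annotation) pairs from human listeners.
    Nearest-neighbour lookup is a placeholder for the trained memory model."""
    feat = np.array([
        np.sqrt(np.mean(component ** 2)),                 # loudness proxy (RMS)
        np.mean(np.abs(np.diff(np.sign(component)))),     # crude brightness proxy
    ])
    dists = [np.linalg.norm(feat - f) for f, _ in memory]
    return memory[int(np.argmin(dists))][1]

if __name__ == "__main__":
    sr = 16000
    t = np.linspace(0.0, 2.0, 2 * sr, endpoint=False)
    # Toy soundscape: quiet background noise with a louder tonal event in the middle.
    soundscape = 0.01 * np.random.randn(t.size)
    soundscape[sr // 2 : sr] += 0.5 * np.sin(2 * np.pi * 440.0 * t[sr // 2 : sr])
    # 'Memory' of human annotations: (feature vector, description) pairs.
    memory = [
        (np.array([0.35, 0.05]), "tonal foreground event, e.g. a bird or alarm"),
        (np.array([0.01, 0.90]), "broadband background, e.g. traffic hum"),
    ]
    for comp in select_components(soundscape, sr):
        print(f"{len(comp) / sr:.2f} s component:", describe(comp, memory))
```

In the sketch, the hand-crafted features and the nearest-neighbour lookup merely illustrate the data flow; in the proposed method these are replaced by structures extracted from the signal itself and by a memory model trained on listener annotations.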