Representing scenes for real-time context classification on mobile devices

Abstract

In this paper we introduce the DCT-GIST image representation model which is useful to summarize the context of the scene. The proposed image descriptor addresses the problem of real-time scene context classification on devices with limited memory and low computational resources (e.g., mobile and other single sensor devices such as wearable cameras). Images are holistically represented starting from the statistics collected in the Discrete Cosine Transform (DCT) domain. Since the DCT coefficients are usually computed within the digital signal processor for the JPEG conversion/storage, the proposed solution allows to obtain an instant and “free of charge” image signature. The novel image representation exploits the DCT coefficients of natural images by modelling them as Laplacian distributions which are summarized by the scale parameter in order to capture the context of the scene. Only discriminative …