Smartphone-based identification of the user's mode of transportation is important for context-aware services. We investigate the feasibility of recognizing the 8 most common modes of locomotion and transportation from the sound recorded by a smartphone carried by the user. We propose a convolutional neural network based recognition pipeline, which operates on the short-time Fourier transform (STFT) spectrogram of the sound in the log domain. Experiments on 366 hours of data from the Sussex-Huawei Locomotion-Transportation (SHL) dataset show promising results: the proposed pipeline recognizes the activities Still, Walk, Run, Bike, Car, Bus, Train and Subway with a global accuracy of 86.6%, which is 23% higher than classical machine learning pipelines. Sound is shown to be particularly useful for distinguishing between vehicle activities (e.g. Car vs Bus, Train vs Subway). This discriminability is complementary to the widely used motion sensors, which are poor at distinguishing between rail and road transport.
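As an illustration of the input representation described above, the following is a minimal sketch of a log-domain STFT spectrogram computed with NumPy. The function name `log_stft_spectrogram`, the Hann window, the frame length of 1024 samples, the hop of 512 samples, the 16 kHz sampling rate and the natural-log compression are all assumptions made for this example; the abstract does not state the parameters actually used, and the convolutional network that would consume this representation is not shown.

```python
import numpy as np


def log_stft_spectrogram(signal, frame_len=1024, hop=512, eps=1e-10):
    """Return a log-magnitude STFT spectrogram.

    Output shape is (n_frames, frame_len // 2 + 1).
    frame_len, hop and eps are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")

    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop

    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )

    # Magnitude of the one-sided FFT of every frame.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))

    # Log compression; eps avoids taking log(0) on silent bins.
    return np.log(magnitude + eps)


# Usage: one second of a 440 Hz tone at an assumed 16 kHz sampling rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
spec = log_stft_spectrogram(np.sin(2 * np.pi * 440 * t))
```

Each row of `spec` is one time frame and each column one frequency bin, so the array can be treated as a single-channel image when passed to a convolutional network.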