There is a growing tendency to embed emotion recognition from user-generated content into automated systems, in order to infer richer profiles of users or content for applications such as call-center operations, recommendations, and assistive technologies. To date, however, adding this functionality has been a tedious, costly, and time-consuming effort: one has to search for different tools that suit one's needs and then write a separate interface for each of them. The MixedEmotions Toolbox addresses this need by providing tools for text, audio, video, and linked data processing within an easily integrable plug-and-play platform. These functionalities include: (i) for text processing: emotion and sentiment recognition, (ii) for audio processing: emotion, age, and gender recognition, (iii) for video processing: face detection and tracking, emotion recognition, facial landmark localization, head pose estimation, face alignment, and body pose estimation, and (iv) for linked data: a knowledge graph. Moreover, the MixedEmotions Toolbox is open-source and free. In this article, we present the toolbox in the context of the existing landscape and provide a range of detailed benchmarks on standardized test-beds showing its state-of-the-art performance. Furthermore, three real-world use cases demonstrate its effectiveness: emotion-driven smart TV, call center monitoring, and brand reputation analysis.