The beneficiary of our services is a company ranked in the Forbes top 100, with more than EUR 100 billion in revenue. It is a world leader in Consumer Electronics, Home Appliances, Home Entertainment, Mobile Communications, Solar and Display Technologies.
The challenge was to develop an image recommendation system for the visual arts and creative industries, similar to what Google image search does.
The system uses an approach based on Convolutional Neural Networks (CNNs), which represent state-of-the-art technology in the field of image recognition. At its core, the system runs in two phases:
- model training & tuning and feature extraction, performed offline
- image query & retrieval, performed online
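The two phases can be sketched as follows. This is a minimal illustration, not the production code: the `extract` callable stands in for the CNN feature extractor, and `build_index` and `query` are hypothetical names. The sketch assumes descriptors are compared by cosine similarity, which the text does not specify.

```python
import heapq
import math


def _unit(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def build_index(images, extract):
    """Offline phase: extract and store one unit-length descriptor per image.

    `images` maps an image id to its data; `extract` is a placeholder for
    the (assumed) CNN feature extractor.
    """
    return {image_id: _unit(extract(data)) for image_id, data in images.items()}


def query(index, query_image, extract, k=3):
    """Online phase: return the ids of the k indexed images most similar to the query."""
    q = _unit(extract(query_image))
    scored = (
        (sum(a * b for a, b in zip(q, vec)), image_id)
        for image_id, vec in index.items()
    )
    return [image_id for _, image_id in heapq.nlargest(k, scored)]
```

Keeping the expensive extraction in `build_index` means the online path only extracts one descriptor (for the query) and scans precomputed vectors.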
We embrace Agile methodologies in most of our projects. Often it is our customers and partners who ask us to use Agile methodologies (Scrum, Kanban) from the inception of the project. When the choice is left to our engineers, we carefully analyze the project specifics and propose a project management methodology based on Scrum or Kanban that best fits the project's needs and context.
The development team consists of 5 R&D developers based at our headquarters, plus a development team on our client's premises. We work closely with the Client Service Team and the Infrastructure Team to offer tech support and maintenance when needed.
Features are learned in three multi-layer CNNs (art-style, content, style) and then combined into one large feature vector (meta-descriptor), which is then used to compute image similarities:
- the ‘content’ model is the pre-trained GoogLeNet model described by Szegedy et al. in ILSVRC 2014 and learns 1000 features describing the content of the images
- the ‘art-style’ model is a tuned version of the GoogLeNet model, trained on 85K paintings annotated with 25 style/genre labels
- the ‘style’ model is the pre-trained ‘Flickr Style’ model described by Karayev et al. and outputs 20 features
- a fourth model learns the type of each image (art, content or style), and its output is used as a dynamic feature-weighting mechanism
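One way to picture the meta-descriptor is the sketch below. It is an illustration under stated assumptions, not the actual implementation: the function names are hypothetical, each model's output is assumed to be L2-normalized before weighting, the per-model weights are assumed to come from the fourth model's output, and cosine similarity is assumed as the similarity measure.

```python
import math


def l2_normalize(vec):
    """Scale a feature block to unit length so no model dominates by magnitude."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def build_meta_descriptor(features, weights):
    """Concatenate the weighted, normalized outputs of the three models.

    `features` maps a model name to its feature list; `weights` maps the same
    names to the (assumed) weights derived from the fourth model's output.
    """
    descriptor = []
    for name in ("art_style", "content", "style"):
        block = l2_normalize(features[name])
        descriptor.extend(weights[name] * x for x in block)
    return descriptor


def cosine_similarity(a, b):
    """Cosine of the angle between two descriptors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0
```

With the dimensions given above, the content block would hold 1000 values and the style block 20. The size of the art-style block is not stated; 25 (one per label) would be a natural guess.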
For performance and scalability reasons, feature learning and extraction are performed on the GPU, using NVIDIA's CUDA platform. Training and feature extraction are done with the Caffe framework, chosen for its speed and modularity.
Architecture & Technologies
The system is designed with extensibility in mind. It has a core that aggregates multiple pluggable modules: style, art-style, content, color. For scalability, we chose to run multiple instances of each pluggable module. The frontend has an adaptive/responsive layout built with Bootstrap in order to support all kinds of devices (including mobile).
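The pluggable design can be sketched roughly as below. Everything here is hypothetical: the `Core` class, its method names, and the round-robin dispatch across instances are assumptions chosen for illustration, since the text does not describe how requests are distributed.

```python
from itertools import cycle
from typing import Callable, Dict, Iterator, List, Sequence

# A module instance is modeled as a callable: image in, feature list out.
Extractor = Callable[[object], List[float]]


class Core:
    """Aggregates named modules, each backed by a pool of instances."""

    def __init__(self) -> None:
        self._pools: Dict[str, Iterator[Extractor]] = {}

    def register(self, name: str, instances: Sequence[Extractor]) -> None:
        """Plug in a module under `name` with one or more instances."""
        if not instances:
            raise ValueError("a module needs at least one instance")
        # cycle() hands out instances in round-robin order (assumed policy).
        self._pools[name] = cycle(list(instances))

    def describe(self, image: object) -> Dict[str, List[float]]:
        """Ask the next instance of every registered module for its features."""
        return {name: next(pool)(image) for name, pool in self._pools.items()}
```

Adding a new module, such as color, then amounts to one more `register` call, and scaling a module means passing more instances to it.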
- C/C++, Python, Java
- OpenCV, OpenCL, Caffe
- Matlab, Octave
- Machine Learning, Deep Learning