Artist Recognition 

In the realm of art and image recognition, leveraging pre-trained transfer learning models has emerged as a game-changer, amplifying accuracy and efficiency in identifying and classifying artistic elements. Adopting a hierarchical tuning approach, the process begins by harnessing the power of six diverse pre-trained models, each bringing its unique strengths and nuances. These models, spanning the VGG family (VGG16, VGG19), the Inception lineage (InceptionV3, Xception, InceptionResNetV2), and EfficientNetV2S, form the foundational bedrock for subsequent refinement.

The initial phase involves training and meticulous analysis of these models, scrutinizing their performance, strengths, and limitations across various artistic genres and visual elements. Following this comprehensive evaluation, a selective curation process identifies the three models that exhibit the most promising potential. These chosen models undergo a refined phase of fine-tuning, where the focus narrows to a reduced set of classes and higher-quality data. By optimizing these models through targeted adjustments, such as modifying hyperparameters or incorporating domain-specific features, the aim is to accentuate their accuracy and sensitivity towards artistic nuances. Ultimately, this rigorous hierarchical process culminates in the selection of the one model that demonstrates the strongest performance after intensive fine-tuning, poised to advance artist recognition with its heightened precision and nuanced understanding of diverse artistic styles.

Approach Illustration

Initial Model Training

The process begins with the initial model training phase, where multiple renowned convolutional neural network (CNN) architectures are employed. This phase leverages a variety of pre-trained models such as InceptionV3, VGG16, VGG19, EfficientNetV2S, Xception, and InceptionResNetV2. These models, well-established in the field of image recognition, are equipped with weights pre-trained on large datasets, enabling a head start in the learning process—a technique known as transfer learning. The training loop is meticulously configured with specific hyperparameters, including the choice of optimizer and learning rate, to optimize the model's ability to learn from the data. As the model trains, key performance metrics such as accuracy and the Area Under the ROC Curve (AUC) are monitored to gauge the effectiveness of the learning process. Model weights are saved during this stage, preserving the learned features for future use and refinement.
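To make this concrete, below is a minimal sketch of such a Phase 1 training loop in Keras. The dataset objects (`train_ds`, `val_ds`), the number of artist classes, the image size, and the learning rate are illustrative assumptions, not the exact values used in the experiment.

```python
import tensorflow as tf
from tensorflow.keras import applications, layers, models

NUM_CLASSES = 50  # assumption: number of artists in the full dataset

BACKBONES = {
    "InceptionV3": applications.InceptionV3,
    "VGG16": applications.VGG16,
    "VGG19": applications.VGG19,
    "EfficientNetV2S": applications.EfficientNetV2S,
    "Xception": applications.Xception,
    "InceptionResNetV2": applications.InceptionResNetV2,
}

for name, constructor in BACKBONES.items():
    # Transfer learning: reuse ImageNet weights, drop the original classifier.
    base = constructor(weights="imagenet", include_top=False,
                       input_shape=(299, 299, 3))
    base.trainable = False  # freeze the pre-trained feature extractor

    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    # Assumes one-hot labels; AUC is tracked alongside accuracy.
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                  loss="categorical_crossentropy",
                  metrics=["accuracy", tf.keras.metrics.AUC(name="auc")])

    history = model.fit(train_ds, validation_data=val_ds, epochs=10)
    model.save_weights(f"{name}_phase1.weights.h5")  # preserve learned features
```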

Model Analysis

At the heart of the process lies the model analysis stage, which serves as the critical evaluation point for the trained models. The models' performance is dissected using a variety of analytical tools. Accuracy and loss history are examined to assess learning progression and convergence over time. A confusion matrix is generated to visualize the model's classification accuracy across different classes, revealing any biases or weaknesses in its predictive capabilities. AUC graphs are plotted to provide a clear picture of the model's discriminative power. This analytical suite provides a comprehensive understanding of the model's current state, informing decisions for subsequent phases of model refinement.
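As an illustration, the sketch below produces the diagnostics described above (accuracy and loss history, a confusion matrix, and AUC curves) from a Keras `history` object; `model` and `val_ds` are carried over from the training sketch, and the validation set is assumed to be one-hot labelled and not shuffled.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# Learning progression: accuracy, loss, and AUC over epochs.
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
for ax, metric in zip(axes, ["accuracy", "loss", "auc"]):
    ax.plot(history.history[metric], label="train")
    ax.plot(history.history[f"val_{metric}"], label="val")
    ax.set_title(metric.capitalize())
    ax.legend()

# Confusion matrix: assumes val_ds is NOT shuffled, so the collected
# labels line up with the predictions below.
y_true = np.concatenate([np.argmax(y, axis=-1) for _, y in val_ds])
y_pred = np.argmax(model.predict(val_ds), axis=-1)
ConfusionMatrixDisplay(confusion_matrix(y_true, y_pred)).plot()
plt.show()
```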

Model Refinement and Retraining

In the model refinement and retraining phase, the focus narrows to the top-performing architectures, reflecting an iterative optimization process. This phase involves a refined training loop, often including advanced configurations and a more selective set of models based on insights gained from the previous analysis. The models typically undergo fine-tuning, a process of adjusting and retraining on the dataset to improve performance metrics. The configurations are carefully adjusted, with a renewed emphasis on hyperparameter tuning to further enhance the model's learning capability. The retraining loop is also where the accuracy and AUC metrics continue to be essential, ensuring the model not only retains its previously learned features but also improves upon them. The refined model weights are saved again, marking an improvement over the initial training phase.
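A common way to implement this fine-tuning step, and one plausible reading of the phase described above, is to unfreeze the top of the pre-trained backbone and retrain with a much smaller learning rate. The layer count and learning rate below are illustrative assumptions.

```python
import tensorflow as tf

# `model` is the Phase 1 Sequential model whose first layer is the backbone.
base = model.layers[0]
base.trainable = True
for layer in base.layers[:-30]:   # assumption: keep early, generic layers frozen
    layer.trainable = False       # only the last ~30 layers are fine-tuned

# Recompile with a much smaller learning rate so pre-trained weights
# are nudged rather than overwritten.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="categorical_crossentropy",
              metrics=["accuracy", tf.keras.metrics.AUC(name="auc")])

model.fit(train_ds, validation_data=val_ds, epochs=10)
model.save_weights("phase2_finetuned.weights.h5")  # refined weights saved again
```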

Final Model: InceptionResNetV2

Central to the entire process is the InceptionResNetV2 model, which plays a pivotal role in the training regime. This hybrid model combines the strengths of the Inception architecture with the benefits of residual connections, making it a powerful tool for image classification tasks. Its position at the core of the flowchart reflects its selection as the final model, owing to its advanced capabilities in handling complex image data. The InceptionResNetV2 architecture is particularly notable for enabling the training of deeper networks without the hindrance of the vanishing gradient problem, making it an ideal candidate for challenging classification tasks. The iterative process culminates with InceptionResNetV2, where the compounded knowledge and improvements from both training phases are applied to achieve the most accurate and robust model performance.


Before we get into the training part, if you want to know more about the architectures →

Model Training - Phase 1

Loop [ InceptionV3, VGG16, VGG19, EfficientNetV2S, Xception, InceptionResNetV2 ]

Model Analysis - Phase 1

Phase 1 training curves: accuracy, loss, and AUC plots for EfficientNetV2S, InceptionResNetV2, InceptionV3, VGG16, VGG19, and Xception.

Inference on the Above

In the initial phase of the experiment, six different pre-trained models were employed for the image classification task: InceptionV3, VGG16, VGG19, EfficientNetV2S, Xception, and InceptionResNetV2. These models were fine-tuned on the dataset using a consistent set of hyperparameters, including the choice of optimizer, learning rate, batch size, and data augmentation techniques. The training process involved multiple epochs to allow the models to learn and adapt to the dataset. Performance metrics such as the Area Under the Receiver Operating Characteristic Curve (AUC), accuracy, and loss were monitored over the course of training to evaluate each model's performance.
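The exact augmentation pipeline is not spelled out above, so the following is only one plausible configuration, shown to illustrate the kind of shared setup that keeps the six-way comparison fair; every transform and parameter here is an assumption.

```python
import tensorflow as tf
from tensorflow.keras import layers

# One shared augmentation pipeline, applied identically to every backbone
# so that differences in metrics come from the models, not the data.
augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),  # assumption: horizontal flips only
    layers.RandomRotation(0.05),      # assumption: small rotations
    layers.RandomZoom(0.1),           # assumption: mild zoom
])

train_ds = train_ds.map(lambda x, y: (augment(x, training=True), y))
```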

After thorough analysis of the training results, it was observed that three models, namely the Inception-based architectures InceptionResNetV2, Xception, and InceptionV3, consistently exhibited superior performance compared to the other three. These models showed higher AUC scores, better accuracy, and lower loss values across multiple epochs. This strong performance indicated that these models were able to capture the underlying patterns and features in the dataset effectively, making them promising candidates for further evaluation in the next phases of the experiment.

For the subsequent phases of the experiment, these three shortlisted models were selected to undergo additional refinement, hyperparameter tuning, or ensemble strategies to optimize their performance further. This strategic approach of narrowing down the model selection based on comprehensive training and evaluation allowed for a more focused and efficient exploration of model configurations, leading to better overall results in the later stages of the project.

Model Training - Phase 2 (Optimizing)

Similarly, for Phase 2 the shortlisted models were trained on a reduced dataset: the most prominent artists were selected from the data to handle the class imbalance that was intrinsically present. The top 11 artists were chosen, with the weightage determined by the number of paintings each had produced. The selectively chosen models from the analysis ended up performing far better than their initial starting points. A sketch of this data-reduction step is shown below.
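The following is a minimal sketch of that selection step, assuming the painting metadata lives in a pandas DataFrame with an artist column; the file name and column name are hypothetical.

```python
import pandas as pd

df = pd.read_csv("paintings.csv")  # assumption: one row per painting

# Weight each artist by the number of paintings they contributed, and keep
# only the 11 most prominent to reduce class imbalance.
top11 = df["artist"].value_counts().nlargest(11).index
df_reduced = df[df["artist"].isin(top11)]
```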

Phase 2 training curves: accuracy and loss for the shortlisted models.

Model Training - Phase 3 ~ Final Model

In this phase, the final model after iterative selection is the InceptionResNetV2 architecture. After one last round of training, it achieved the micro- and macro-averaged metrics shown in the classification report below.

Confusion Matrix on 1 Batch of Data

Classification Report on 1 Batch of Data
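For reference, a one-batch confusion matrix and classification report of the kind shown above can be produced as in the sketch below; `test_ds` and the fine-tuned `model` are assumptions carried over from the earlier sketches.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Pull a single batch of (images, one-hot labels) from the test pipeline.
images, labels = next(iter(test_ds))
y_true = np.argmax(labels.numpy(), axis=-1)
y_pred = np.argmax(model.predict(images), axis=-1)

print(confusion_matrix(y_true, y_pred))
# Per-class precision/recall/F1 plus the averaged summary rows.
print(classification_report(y_true, y_pred))
```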

Model in Action