Leveraging computer vision and hybrid transfer learning for deceased bird species classification
Recent advances in deep learning, particularly CNN- and transformer-based architectures, have driven significant progress in image classification, and these models are now widely applied in computer vision for their robustness and high accuracy. In response to recent global outbreaks of Highly Pathogenic Avian Influenza (HPAI) and the associated risks to New Zealand's native birds, poultry industry, and public health, our team aims to explore the deployment of computer vision technologies to enhance New Zealand's avian flu pandemic preparedness.
While computer vision models for classifying live bird species have achieved notable success, often exceeding 90% accuracy through fine-tuning of CNN and transformer-based architectures, they face challenges when applied to images of deceased birds, which tend to be noisier and lack the distinctive features visible in live birds. Our goal is to develop a fast, lightweight model with minimal GPU and memory requirements, capable of classifying both dead and live bird species and deployable on mobile devices.
To address this challenge, we collected and labelled 1,200 images of dead birds from New Zealand and adopted EfficientNetB2 and Vision Transformer (ViT) architectures as the backbones of our models.
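To give a concrete sense of the direct training route, the sketch below fine-tunes a pretrained EfficientNetB2 on a folder of labelled images. It assumes PyTorch, timm, and torchvision; the dataset path, number of species, and hyperparameters are illustrative placeholders rather than our actual configuration.

```python
# A minimal fine-tuning sketch, assuming PyTorch, timm, and torchvision.
# "data/train", NUM_CLASSES, and the hyperparameters are illustrative
# placeholders, not our actual configuration.
import timm
import torch
from timm.data import create_transform, resolve_data_config
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets

NUM_CLASSES = 30  # hypothetical number of bird species in the labelled set
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load an ImageNet-pretrained EfficientNetB2 and replace its head with a
# fresh classifier sized for our species labels.
model = timm.create_model("efficientnet_b2", pretrained=True,
                          num_classes=NUM_CLASSES).to(device)
transform = create_transform(**resolve_data_config({}, model=model),
                             is_training=True)

# ImageFolder expects one sub-folder per species under the root directory.
train_ds = datasets.ImageFolder("data/train", transform=transform)
loader = DataLoader(train_ds, batch_size=32, shuffle=True, num_workers=4)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```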
Our presentation will cover the following objectives:
- Evaluate the performance of EfficientNetB2 and ViT architectures when trained directly on images of deceased birds.
- Explore a hybrid approach that combines CNN- and transformer-based transfer learning with traditional classifiers for deceased bird species classification (a sketch of this idea follows this list).
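As one plausible illustration of the hybrid route, the sketch below uses frozen EfficientNetB2 and ViT backbones purely as feature extractors and trains a traditional classifier on their concatenated embeddings. The specific backbone checkpoints, the SVM classifier, and the train_paths/train_labels variables are assumptions for illustration, not the exact pipeline we will present.

```python
# A minimal sketch of the hybrid route: frozen pretrained backbones act as
# feature extractors and a traditional classifier is trained on top.
# The backbone checkpoints, the SVM, and the train_paths/train_labels
# variables are illustrative assumptions.
import numpy as np
import timm
import torch
from PIL import Image
from sklearn.svm import SVC
from timm.data import create_transform, resolve_data_config

device = "cuda" if torch.cuda.is_available() else "cpu"

# num_classes=0 makes timm return pooled embeddings rather than class logits.
cnn = timm.create_model("efficientnet_b2", pretrained=True, num_classes=0).eval().to(device)
vit = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=0).eval().to(device)

cnn_tf = create_transform(**resolve_data_config({}, model=cnn))
vit_tf = create_transform(**resolve_data_config({}, model=vit))

@torch.no_grad()
def embed(image_path: str) -> np.ndarray:
    """Return concatenated CNN and ViT embeddings for a single image."""
    img = Image.open(image_path).convert("RGB")
    f_cnn = cnn(cnn_tf(img).unsqueeze(0).to(device))
    f_vit = vit(vit_tf(img).unsqueeze(0).to(device))
    return torch.cat([f_cnn, f_vit], dim=1).squeeze(0).cpu().numpy()

# train_paths / train_labels stand in for the labelled deceased-bird images.
# X = np.stack([embed(p) for p in train_paths])
# clf = SVC(kernel="rbf").fit(X, train_labels)
```

Because only the compact classical classifier is trained on top of fixed embeddings, this route keeps training cost low and the deployable footprint small, which is part of why we argue it suits mobile use.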
We will further elaborate on why this hybrid approach is particularly well suited to classifying both deceased and live bird species. Our methodology and results will demonstrate how hybrid transfer learning lends itself to mobile applications, where compute and memory are limited. Additionally, we will showcase a demo of our app, highlighting rapid, accurate, and lightweight on-device inference.
ABOUT THE AUTHORS
Zhuo Tang is a master’s student in Applied Data Science at the University of Canterbury, with research interests centred on data science, the application of AI in GI science, and the intersections between these fields.
Richard Dean is a senior data scientist in ESR's core data science team. He has over 20 years’ experience working with health, forensic and environmental data sets, specialising in the development of novel data tools and visualisation techniques. He currently leads a research programme looking at the use of computer vision for rapid diagnostics and is responsible for the development of ESR's digital twin user interface.
Jiawei Zhao is ESR’s machine learning engineer, with a doctoral degree in computer science specialising in natural language processing, computer vision and speech recognition. His research interests centre on advancing the capabilities of these domains and pushing the boundaries of artificial intelligence.
-----
For more information about the eResearch NZ / eRangahau Aotearoa conference, visit:
https://eresearchnz.co.nz/