
Amanda Dash

  • BEng (University of Victoria, 2015)

Notice of the Final Oral Examination for the Degree of Doctor of Philosophy

Topic

Addressing Data Scarcity with Computer Vision Methods

Department of Electrical and Computer Engineering

Date & location

  • Thursday, April 25, 2024

  • 10:00 A.M.

  • Virtual Defence


Reviewers

Supervisory Committee

  • Dr. Alexandra Branzan Albu, Department of Electrical and Computer Engineering, University of Victoria (Supervisor)

  • Dr. Stephen Neville, Department of Electrical and Computer Engineering, UVic (Member)

  • Dr. Alex Thomo, Department of Computer Science, UVic (Outside Member) 

External Examiner

  • Dr. Mark Eramian, Department of Computer Science, University of Saskatchewan 

Chair of Oral Examination

  • Dr. Pascal Courty, Department of Economics, UVic


Abstract

Data scarcity characterizes situations where the demand for abundant, high-quality data exceeds its availability. The lack of quality data is a significant issue when designing and implementing computer vision-based algorithms; more specifically, deep learning-based approaches require large amounts of curated data for training and validation. In many scenarios, such as environmental monitoring, gathering more data is not viable. This thesis explores different methodologies and strategies for overcoming data scarcity in computer vision algorithms. While addressing all methods for handling data scarcity would be an over-ambitious endeavour, this thesis focuses on three primary strategies for working with small datasets: traditional computer vision, deep learning regularization functions, and synthetic datasets. Detailed objectives, solutions, and insights from each are presented for diverse problem domains and case studies within the computer vision field.

The first strategy consists of developing traditional computer vision methods. We discuss this strategy through two case studies: estimating bird populations and domain-independent video summarization. The first case study results in a method that integrates motion analysis and segmentation to cluster and count birds in large moving flocks filmed by citizen scientists with hand-held video devices. The second case study addresses the high demand for automatic video summarization systems driven by the dramatic increase in media streaming content and consumer-level video creation; our proposed method takes a bottom-up approach to the automatic generation of dynamic video summaries, integrating motion and saliency analysis with temporal slicing.
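As an illustration of the general idea behind the first case study, the following is a minimal sketch that counts moving objects by combining frame differencing (motion analysis) with connected-component labelling (segmentation), assuming OpenCV is available. The input video name, threshold, and minimum blob area are placeholders, and this is a generic example rather than the method proposed in the thesis.

    # Minimal sketch: count moving blobs with frame differencing and
    # connected-component analysis (OpenCV). Generic illustration only;
    # "flock.mp4" and the numeric thresholds are placeholders.
    import cv2
    import numpy as np

    cap = cv2.VideoCapture("flock.mp4")  # hypothetical input video
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # Motion analysis: absolute difference between consecutive frames
        diff = cv2.absdiff(gray, prev_gray)
        _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)

        # Segmentation: clean the mask, then label connected components
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
        num_labels, _, stats, _ = cv2.connectedComponentsWithStats(mask)

        # Count components above a minimum area (label 0 is the background)
        count = sum(1 for i in range(1, num_labels)
                    if stats[i, cv2.CC_STAT_AREA] > 10)
        print(f"moving objects in frame: {count}")

        prev_gray = gray

    cap.release()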

The second strategy focuses on using regularization functions while training deep learning systems. We propose a novel custom loss function, Dense Loss, which uses local region homogeneity regularization to promote contiguous and smooth segmentation predictions while also using an L1-norm loss to reconstruct densely labelled annotation ground truth for a synthetic handwritten-annotation mixed-media dataset. Regularization also helps when foreground and background classes are not well represented; we therefore propose a texture-based, domain-specific data augmentation technique applicable when training deep learning image classification models on small datasets.
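For illustration only, the sketch below pairs an L1-norm reconstruction term with a total-variation-style local smoothness penalty in PyTorch. It shows the general structure of a loss that combines dense-label reconstruction with a homogeneity regularizer; it is not the exact Dense Loss formulation from the thesis, and the weighting factor lam is a hypothetical hyperparameter.

    # Minimal sketch: L1 reconstruction loss plus a local-smoothness
    # (homogeneity) penalty on the prediction. Not the thesis' Dense Loss;
    # lam is a hypothetical weighting hyperparameter.
    import torch

    def smooth_l1_reconstruction_loss(pred, target, lam=0.1):
        # L1-norm reconstruction of the dense ground-truth labels
        l1 = torch.mean(torch.abs(pred - target))

        # Local homogeneity term: penalize differences between neighbouring
        # pixels so predictions stay contiguous and smooth (TV-style penalty)
        dh = torch.mean(torch.abs(pred[..., :, 1:] - pred[..., :, :-1]))
        dv = torch.mean(torch.abs(pred[..., 1:, :] - pred[..., :-1, :]))

        return l1 + lam * (dh + dv)

    # Example usage on dummy (N, C, H, W) tensors
    pred = torch.rand(2, 1, 64, 64, requires_grad=True)
    target = torch.rand(2, 1, 64, 64)
    loss = smooth_l1_reconstruction_loss(pred, target)
    loss.backward()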

The third strategy consists of generating synthetic datasets and evaluating the performance of state-of-the-art deep learning architectures when trained on them. We propose a mosaic texture dataset and an image-to-text table summarization dataset. Both address a lack of data in their corresponding application domains. Our research shows that each application domain affected by data scarcity needs to be thoroughly studied before proposing solutions to mitigate this problem.
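As a toy example of how a single synthetic sample might be assembled, the sketch below tiles random crops from a small pool of source textures into a mosaic image using Pillow and NumPy. It only illustrates the general notion of a synthetic mosaic texture dataset; the file names, tile size, and grid layout are hypothetical and do not reflect the generation pipeline developed in the thesis.

    # Minimal sketch: build one synthetic mosaic image by tiling random
    # crops from a few source textures. Assumes each source texture is
    # larger than the tile size; file names are placeholders.
    import random
    import numpy as np
    from PIL import Image

    def make_mosaic(texture_paths, grid=(4, 4), tile=64, seed=0):
        rng = random.Random(seed)
        textures = [np.asarray(Image.open(p).convert("RGB")) for p in texture_paths]
        rows, cols = grid
        mosaic = np.zeros((rows * tile, cols * tile, 3), dtype=np.uint8)

        for r in range(rows):
            for c in range(cols):
                tex = rng.choice(textures)
                # Random crop of one tile from the chosen texture
                y = rng.randrange(0, tex.shape[0] - tile)
                x = rng.randrange(0, tex.shape[1] - tile)
                mosaic[r * tile:(r + 1) * tile,
                       c * tile:(c + 1) * tile] = tex[y:y + tile, x:x + tile]

        return Image.fromarray(mosaic)

    # Hypothetical usage with placeholder texture files
    # make_mosaic(["brick.png", "grass.png", "wood.png"]).save("mosaic_0000.png")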

Each of the projects developed in this thesis supports the hypothesis that small datasets are viable sources for research and applications when their particularities are addressed during development and implementation.