Research course

Applying Multimodal Foundation Models to Identify Small and Not Well-Defined Objects

Institution
University of Salford · School of Science, Engineering and Environment
Qualifications
PhD

Entry requirements

Please use this Research Proposal, Personal statement and CV writing guide when preparing an application.

Months of entry

Anytime

Course content

The field of computer vision has witnessed remarkable advancements, fuelled by the development of large-scale foundation models. These models, capable of processing and understanding multiple modalities such as text and images, have opened up new possibilities for a wide range of applications. One particularly promising area is the identification of small and not well-defined objects, which has significant implications for fields like medical imaging, healthcare, and remote sensing.

Objectives:

· Our first objective is to design and optimize a multimodal vision model that integrates both visual and textual data to enhance the detection of small, ambiguous objects in medical images. Specifically, we aim to improve the accuracy of identifying early-stage lesions by at least 10% over current benchmarks. This research will leverage state-of-the-art fine-tuning techniques and data fusion methods, with initial development and testing scheduled within the first 18 months. By advancing detection capabilities in medical imaging, we hope to contribute directly to early disease diagnosis and improved patient outcomes.

· The second objective focuses on developing a robust, end-to-end computer vision pipeline tailored for remote sensing applications. Our goal is to accurately identify subtle features such as minor infrastructural elements and environmental changes, reducing false positives by approximately 15%. By incorporating recent advancements in model interpretability and multimodal learning, this pipeline will be designed to deliver transparent and reliable results that domain experts can trust. We plan to validate and benchmark the system within the first two years, ensuring its practical impact in environmental monitoring and disaster management.

· Our third objective is to assess and enhance the generalizability of large-scale foundation models across diverse domains, ranging from medical imaging to remote sensing. We aim to achieve an average detection performance improvement of 15% on multiple benchmark datasets by systematically adapting and refining these models for different applications. Through comprehensive cross-validation and iterative model enhancements throughout the PhD programme, this research will offer valuable insights into model adaptability, helping to ensure that these advanced models are effective across various high-impact real-world scenarios.



Fees and funding

This programme is self-funded.

To enquire about University of Salford funding schemes – including the Widening Participation Scholarship – visit this website.

Qualification, course duration and attendance options

  • PhD
    full time
    36 months
    • Campus-based learning is available for this qualification
    part time
    60 months
    • Campus-based learning is available for this qualification

Course contact details

Name
SEE PGR Support
Email
PGR-SupportSSEE@salford.ac.uk