Research - Eva Martin

Research Approach

I approach research as an exercise in asking better questions. The most interesting problems often sit at disciplinary boundaries, where the messy realities of human behavior meet computational analysis. What excites me most about this moment in data science is the emergence of foundation models and generative AI: tools that can reason about context, handle nuance, and tackle problems that were intractable just a few years ago. These approaches are particularly strong on subjective classifications, cultural context, and incomplete information. My work explores how to use these capabilities responsibly and effectively, especially in domains where traditional ML approaches have hit their limits.

Research Interests

My research focuses on developing computational methods for problems where human behavior, culture, and technology intersect. I'm particularly drawn to domains where traditional approaches have failed specific populations or where systematic biases create predictable blind spots.

Crisis Informatics & Disaster Response

Building on my thesis work in disaster image classification, I'm interested in how computational methods can improve crisis response systems. Cultural context fundamentally shapes emergency behavior: Mediterranean fatalism produces entirely different responses than Nordic preparedness culture. I want to understand these patterns and develop AI systems that account for how communities actually respond to threats, not just how our models predict they should. The goal is better resource allocation and more effective humanitarian response.

Risk Perception & Decision-Making

Human risk assessment diverges from statistical reality in ways that correlate strongly with social factors. Demographics, cultural background, political affiliation, and social networks often predict risk behavior better than actual threat levels. I'm interested in using causal ML to model how these social dimensions shape decision-making. Whether it's vaccine hesitancy, disaster evacuation, or financial planning, understanding the interplay between social identity and risk perception could fundamentally improve how we design interventions and communicate about uncertainty.

Adaptive Communication Technologies

Recent studies suggest many individuals with severe communication disorders have hidden cognitive capabilities that our assessment methods simply miss. I'm interested in developing ML systems that learn from interaction patterns over time. These systems would distinguish between motor and cognitive limitations, decode disrupted communication in conditions like aphasia, and create interfaces that preserve dignity while maximizing expression. One study found that up to a third of Rett syndrome patients showed normal cognition when properly assessed. How many people are we systematically underestimating?

Computational Approaches to Incomplete Data

Some of the most compelling puzzles involve severe data limitations. Finding archaeological sites through satellite imagery, tracking historical disease spread through fragmentary records, searching for missing aircraft through oceanographic patterns: these problems require creative methodological approaches. I'm drawn to scenarios where we must first discover what patterns to look for, then develop novel ways to detect them despite missing information.

Recent Research Projects

MSc in Data Science & AI (2022-2025)

Throughout my MSc at University of London, Goldsmiths, I explored diverse applications of computational methods to real-world problems. My coursework projects investigated social factors in risk-taking (replicating MIT's study on pilots flying through thunderstorms), developed decision support systems for education policy using OECD data, and applied NLP techniques to disaster response message classification. I also worked on wildfire prediction using spatiotemporal data and examined public trust in science through survey analysis. This breadth of projects reinforced my interest in problems where human behavior meets computational analysis.

Featured Research: Crisis Informatics & Modern AI

Improving Image Classification in Crisis Informatics Through Modern Generative Approaches

MSc Thesis · University of London, Goldsmiths · 2025 · Grade: Distinction

This research investigates whether modern AI techniques can improve disaster image classification for humanitarian response. I compared two fundamentally different approaches to beating established benchmarks: enhancing Convolutional Neural Network performance through targeted synthetic data augmentation, and using large multimodal models to classify disaster images directly without any task-specific training.

Both approaches were evaluated on the MEDIC crisis informatics dataset (71,198 images) across four classification tasks: disaster type, damage severity, informativeness for humanitarian response, and humanitarian category classification.

Read the Full Thesis

The complete thesis includes detailed methodology, comprehensive results analysis, literature review, and future research directions.


Key Research Findings

Foundation Models Outperform Traditional Approaches

GPT-4o outperformed the CNN baselines on the most challenging categories without any task-specific training. The advantage was most pronounced for ambiguous categories, where context matters more than visual patterns.

Key finding: On the most ambiguous categories, foundation models achieved F1 scores of 57-61%, where CNNs reached only 24-36%.

Dataset Quality Matters

Analysis of the MEDIC dataset revealed significant label quality problems, which motivated a conservative relabelling approach. My multi-model relabelling methodology (requiring both CNN consensus and dual LLM agreement) corrected ~5% of labels across the dataset.

Impact: The most problematic categories saw their F1 scores more than double after relabelling, while overall task performance improved by 3-4 points.
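The conservative rule described above can be illustrated with a minimal sketch. The exact decision logic is an assumption made for illustration, not the thesis implementation: here a label is changed only when every CNN agrees on a single label that differs from the original and both LLMs independently name that same label. The function name `propose_relabel` and its parameters are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def propose_relabel(
    original: str,
    cnn_preds: Sequence[str],
    llm_preds: Sequence[str],
) -> Optional[str]:
    """Return a replacement label, or None to keep the original label.

    Assumed rule (illustrative only): all CNNs must agree on one label that
    differs from the original, and exactly two LLMs must both confirm it.
    """
    if not cnn_preds or len(llm_preds) != 2:
        return None
    cnn_label, votes = Counter(cnn_preds).most_common(1)[0]
    if votes != len(cnn_preds):
        return None  # the CNNs disagree with each other, so do nothing
    if cnn_label == original:
        return None  # the CNNs support the existing label
    if all(pred == cnn_label for pred in llm_preds):
        return cnn_label  # unanimous CNNs plus both LLMs: relabel
    return None  # the LLMs do not both confirm, so keep the original
```

Requiring unanimity at both stages trades recall for precision: many genuinely wrong labels stay untouched, but a label is rarely changed in error, which is consistent with only a small share of the dataset being corrected.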

Synthetic Data: Limitations and Insights

Advanced synthetic data augmentation using LLM-generated captions and diffusion models revealed both technical limitations and insights about model behavior. Safety filters frequently blocked realistic disaster imagery involving people, while t-SNE analysis showed synthetic images populated new feature space regions rather than reinforcing existing class boundaries.

Synthetic Image Generation Example

Figure 1: Original disaster image

The LLM generates this description: "Ground-level view of a narrow urban alleyway covered in brick debris and rubble following an earthquake. Chinese rescue workers in bright red uniforms huddle in the foreground left, while damaged but standing buildings line both sides."

Figure 2: Synthetic image generated from description alone

The remarkable similarity between original and synthetic images demonstrates the pipeline's effectiveness. However, feature space analysis revealed that synthetic images occupied distinct regions, introducing novel visual patterns rather than reinforcing existing training distributions.

Technical Implementation

Baseline & Dataset Enhancement

Performance Replication: Successfully reproduced original MEDIC benchmarks across ResNet50, EfficientNet-B1, MobileNet-V2

Training Optimization: 7-18× faster training via custom NVIDIA DALI pipeline

Conservative Relabelling: Multi-CNN disagreement detection plus dual LLM consensus approach

Synthetic Data Strategy

Humanitarian Impact Allocation: 10,000 synthetic images allocated using weighted formula prioritizing affected people (+28) and rescue efforts (+20)

Generation Pipeline: LLM captioning (Claude 3.5, GPT-4o) → Flux 1-dev diffusion model

Diversity Engineering: Custom keywords for temporal, geographic, and perspective variation
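One way a weighted allocation of this kind could work is sketched below. The thesis formula is not reproduced here: the proportional split, the base score of 10, and the class names are all assumptions for illustration, and the +28 and +20 values are treated as additive bonuses, which is an interpretation of the summary above. The largest-remainder step guarantees the counts sum exactly to the budget.

```python
from typing import Dict


def allocate(total: int, weights: Dict[str, float]) -> Dict[str, int]:
    """Split `total` items across classes in proportion to their weights.

    Uses the largest-remainder method so the integer counts sum to `total`.
    """
    if total < 0 or not weights:
        raise ValueError("total must be non-negative and weights non-empty")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("at least one weight must be positive")

    exact = {name: total * w / weight_sum for name, w in weights.items()}
    counts = {name: int(share) for name, share in exact.items()}
    leftover = total - sum(counts.values())

    # Hand the remaining items to the classes with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Hypothetical weights: an assumed base score of 10 plus the bonuses quoted above.
BASE = 10
example_weights = {
    "affected_people": BASE + 28,
    "rescue_efforts": BASE + 20,
    "infrastructure_damage": BASE,
    "not_humanitarian": BASE,
}
example_counts = allocate(10_000, example_weights)
```

Under these assumed weights, the two prioritized categories receive most of the budget while every class still gets a non-zero share.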

Zero-Shot Classification

Model Evaluation: GPT-4o, Claude 3.5 Sonnet, Pixtral Large/Small, Claude Haiku

Prompt Engineering: Five distinct strategies tested, incorporating original MEDIC annotation guidelines

Validation Methodology: Bootstrap confidence intervals, McNemar significance tests
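The two validation techniques named above can be sketched with the standard library alone. This is a generic illustration, not the thesis code: it assumes a percentile bootstrap over per-class F1 and the exact (binomial) form of McNemar's test, and the function names and defaults are hypothetical.

```python
import math
import random
from typing import Hashable, List, Sequence, Tuple


def f1_for_class(y_true: Sequence, y_pred: Sequence, positive: Hashable) -> float:
    """F1 score for one class, treating every other class as negative."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_ci(y_true: Sequence, y_pred: Sequence, positive: Hashable,
                 n_resamples: int = 1000, alpha: float = 0.05,
                 seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for per-class F1."""
    rng = random.Random(seed)
    n = len(y_true)
    scores: List[float] = []
    for _ in range(n_resamples):
        # Resample example indices with replacement, keeping pairs aligned.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1_for_class([y_true[i] for i in idx],
                                   [y_pred[i] for i in idx], positive))
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


def mcnemar_exact(y_true: Sequence, pred_a: Sequence, pred_b: Sequence) -> float:
    """Two-sided exact McNemar p-value computed from the discordant pairs."""
    only_a = sum(a == t and b != t for t, a, b in zip(y_true, pred_a, pred_b))
    only_b = sum(a != t and b == t for t, a, b in zip(y_true, pred_a, pred_b))
    n = only_a + only_b
    if n == 0:
        return 1.0  # the two models never disagree on correctness
    k = min(only_a, only_b)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

McNemar's test suits this comparison because both models are scored on the same images, so only the cases where exactly one of them is correct carry information about which is better.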

Analysis Framework

Multi-task Learning: Simultaneous prediction across all four classification tasks

Feature Space Analysis: t-SNE visualization of penultimate layer embeddings

Error Pattern Analysis: Cross-task correlation identification and interpretation

Methodological Contributions

This research contributed methodological innovations including a conservative multi-model relabelling pipeline for noisy datasets, a humanitarian-impact allocation approach for synthetic data generation, and feature space analysis techniques for understanding augmentation strategy effectiveness. These methodologies extend beyond crisis informatics to any domain requiring classification system improvement with ethical constraints.

Key Insight: Complementary Strengths

The thesis shows that CNNs and foundation models have complementary strengths: CNNs excel at learning implicit patterns within their training distribution, while foundation models draw on broader world knowledge for subjective categories. This suggests hybrid approaches may be optimal, using fast CNNs for clear-cut classifications and foundation models for ambiguous cases that require conceptual reasoning.
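A hybrid of this kind could be as simple as confidence-based routing. The sketch below is an illustration under stated assumptions, not something evaluated in the thesis: `cnn_predict` and `llm_classify` are hypothetical callables standing in for the two model types, and the 0.8 threshold is an arbitrary placeholder that would need tuning on validation data.

```python
from typing import Any, Callable, Dict, Tuple


def route(
    image: Any,
    cnn_predict: Callable[[Any], Dict[str, float]],
    llm_classify: Callable[[Any], str],
    threshold: float = 0.8,
) -> Tuple[str, str]:
    """Return (label, source), where source is "cnn" or "llm".

    The CNN answers when its top class probability clears the threshold;
    otherwise the image is escalated to the slower foundation model.
    """
    probs = cnn_predict(image)
    label, confidence = max(probs.items(), key=lambda item: item[1])
    if confidence >= threshold:
        return label, "cnn"
    return llm_classify(image), "llm"
```

The appeal of this design is cost: the expensive model is called only for the fraction of images where the CNN is unsure, which is also where the foundation models showed their largest advantage.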

Collaboration Opportunities

I'm actively seeking research collaborations where analytical strategy meets technical implementation. My background combining international business experience with technical data science training positions me well for projects requiring both methodological rigor and practical feasibility assessment.

Particularly interested in:

Crisis informatics and humanitarian technology
Computational social science and behavioral modeling
Accessibility technology and assistive communication
Cross-cultural studies leveraging multilingual AI
Methodological research on AI evaluation and dataset quality

If you're working on problems where human factors complicate traditional computational approaches, I'd love to discuss potential collaboration.
