Richard Diehl Martinez

I am a Research Fellow in Computer Science at the University of Cambridge.

I completed my PhD at the University of Cambridge as a Gates Scholar. Before that, I was an Applied Research Scientist at Amazon Alexa, focusing on language modeling research. I hold an M.S. in Computer Science and a B.S. in Management Science from Stanford University.

Currently, I research pre-training techniques and develop analytical methods to improve the performance and efficiency of language models. More broadly, my interests lie at the intersection of machine learning, linguistics, and neuroscience.

I'm the creator of:

Pico

An open-source framework for small language model research and development.

Learn more about my work:

Select Publications

Pico: A Modular Framework for Hypothesis-Driven Small Language Model Research

Richard Diehl Martinez, David Demitri Africa, Yuval Weiss, Suchir Salhan, Ryan Daniels, Paula Buttery

Conference: EMNLP 2025 (System Demonstrations)

Building language models (LMs), especially small and medium ones, remains more art than science. We introduce Pico, a lightweight, modular framework that enables systematic, hypothesis-driven, and reproducible research on small and medium-scale language models.

Read Paper →

Investigating ReLoRA: Effects on the Learning Dynamics of Small Language Models

Yuval Weiss, David Demitri Africa, Paula Buttery, Richard Diehl Martinez

Workshop: BlackboxNLP 2025

We investigate whether ReLoRA, a parameter-efficient training method that works well for large language models, is also beneficial for small language models. Our analysis focuses on rank deficiency and the learning dynamics of models at this scale.

Read Paper →

Meta-Pretraining for Zero-Shot Cross-Lingual Named Entity Recognition in Low-Resource Philippine Languages

David Demitri Africa, Suchir Salhan, Yuval Weiss, Paula Buttery, Richard Diehl Martinez

Workshop: MRL 2025

We explore whether small decoder LMs can be pretrained to adapt quickly and transfer zero-shot to languages unseen during pretraining. Using MAML with Tagalog and Cebuano, we achieve significant improvements in zero-shot named entity recognition.

Read Paper →
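For readers unfamiliar with MAML, the sketch below shows the core inner/outer-loop structure of first-order meta-learning (FOMAML) on a toy classification head. The model, task sampler, and hyperparameters are illustrative placeholders and do not reflect the paper's actual meta-pretraining setup.

```python
# A minimal first-order MAML (FOMAML) sketch on a toy token-classification head.
# The model, sample_task, and hyperparameters are placeholders, not the paper's setup.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

model = nn.Linear(128, 9)                      # stand-in for a small decoder + NER head
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
inner_lr, inner_steps, tasks_per_batch = 1e-2, 3, 4

def sample_task():
    """Hypothetical sampler: support and query batches for one language/task."""
    support = (torch.randn(16, 128), torch.randint(0, 9, (16,)))
    query = (torch.randn(16, 128), torch.randint(0, 9, (16,)))
    return support, query

for step in range(100):
    meta_opt.zero_grad()
    for _ in range(tasks_per_batch):
        (xs, ys), (xq, yq) = sample_task()
        # Inner loop: adapt a detached copy of the parameters on the support set.
        params = {name: p.detach().clone().requires_grad_(True)
                  for name, p in model.named_parameters()}
        for _ in range(inner_steps):
            loss = F.cross_entropy(functional_call(model, params, (xs,)), ys)
            grads = torch.autograd.grad(loss, list(params.values()))
            params = {name: p - inner_lr * g
                      for (name, p), g in zip(params.items(), grads)}
        # Outer loop (first-order): gradients of the query loss w.r.t. the adapted
        # copy are accumulated onto the shared initialization.
        query_loss = F.cross_entropy(functional_call(model, params, (xq,)), yq)
        query_grads = torch.autograd.grad(query_loss, list(params.values()))
        for p, g in zip(model.parameters(), query_grads):
            p.grad = g.clone() if p.grad is None else p.grad + g
    meta_opt.step()
```

Each task adapts a copy of the parameters on its support set, and the query-set gradients of that adapted copy update the shared initialization; the paper applies this idea to NER tasks in Tagalog and Cebuano.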

Mitigating Frequency Bias and Anisotropy in Language Model Pre-Training with Syntactic Smoothing

Richard Diehl Martinez, Zebulon Goriely, Andrew Caines, Paula Buttery, Lisa Beinborn

Conference: EMNLP 2024

Language models over-rely on token frequency during pre-training, leading to poor generalization for infrequent tokens and anisotropic representations. Our method, Syntactic Smoothing, induces a syntactic prior to improve model performance on rare tokens and reduce anisotropy.

Read Paper →
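As a companion to the abstract above, the snippet below shows a common way to quantify anisotropy: the expected cosine similarity between representations of randomly sampled tokens. It is a generic diagnostic, not the Syntactic Smoothing method itself or the paper's exact evaluation protocol; the function name and its arguments are illustrative.

```python
# A generic anisotropy diagnostic: average cosine similarity between randomly
# sampled pairs of token representations. This is a standard measure of
# representation geometry, not the paper's Syntactic Smoothing method.
import torch
import torch.nn.functional as F

def anisotropy(hidden_states: torch.Tensor, n_pairs: int = 10_000) -> float:
    """hidden_states: (num_tokens, dim) matrix of token representations."""
    n = hidden_states.size(0)
    i = torch.randint(0, n, (n_pairs,))
    j = torch.randint(0, n, (n_pairs,))
    mask = i != j                                          # drop self-pairs
    sims = F.cosine_similarity(hidden_states[i[mask]], hidden_states[j[mask]], dim=-1)
    return sims.mean().item()

# Random vectors are roughly isotropic, so the value is near 0:
print(anisotropy(torch.randn(5_000, 768)))
```

In a perfectly isotropic space the expected similarity is near zero; the closer the value is to one, the more the representations collapse into a narrow cone.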

Tending Towards Stability: Convergence Challenges in Small Language Models

Richard Diehl Martinez, Pietro Lesci, Paula Buttery

Conference: EMNLP Findings 2024

Smaller language models struggle to converge as efficiently as larger ones, especially in later training stages. Our analysis of the Pythia model suite investigates how the effective rank of parameters impacts convergence dynamics across model sizes.

Read Paper →
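For context on the quantity mentioned above, the snippet below computes one common definition of effective rank (Roy & Vetterli, 2007): the exponential of the Shannon entropy of a matrix's normalized singular values. It illustrates the measure itself and is not meant to reproduce the paper's analysis pipeline over the Pythia checkpoints.

```python
# Effective rank in the sense of Roy & Vetterli (2007): exp of the entropy of the
# normalized singular value distribution. Illustrative only.
import torch

def effective_rank(weight: torch.Tensor) -> float:
    s = torch.linalg.svdvals(weight.float())          # singular values
    p = s / s.sum()                                    # normalize to a distribution
    entropy = -(p * torch.log(p.clamp_min(1e-12))).sum()
    return torch.exp(entropy).item()

# A full-rank random matrix has a large effective rank, while a low-rank matrix
# has an effective rank near its true rank.
print(effective_rank(torch.randn(512, 512)))                       # a few hundred
print(effective_rank(torch.randn(512, 8) @ torch.randn(8, 512)))   # close to 8
```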

CLIMB: Curriculum Learning for Infant-inspired Model Building

Richard Diehl Martinez, Hope McGovern, Zebulon Goriely, Christopher Davis, Andrew Caines, Paula Buttery, Lisa Beinborn

Conference: CoNLL 2023 (Best Paper)

Our CLIMB model, built for the BabyLM Challenge, uses cognitively inspired curriculum learning to improve small language model training. We explore vocabulary, data, and objective curricula to enhance linguistic generalization capabilities.

Read Paper →
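As a toy illustration of the data-curriculum idea mentioned above, the snippet below orders training examples by a crude difficulty proxy (sentence length) and yields them easiest-first. The proxy, corpus, and batch size are placeholders; CLIMB's actual vocabulary, data, and objective curricula are more involved.

```python
# A toy data curriculum: sort examples by a simple difficulty proxy (length) and
# yield batches easiest-first. Not CLIMB's actual curricula.
from typing import Iterator

def length_curriculum(sentences: list[str], batch_size: int = 2) -> Iterator[list[str]]:
    """Yield batches of sentences ordered from shortest to longest."""
    ordered = sorted(sentences, key=lambda s: len(s.split()))
    for i in range(0, len(ordered), batch_size):
        yield ordered[i:i + batch_size]

corpus = [
    "dogs bark",
    "the cat sat on the mat",
    "a considerably longer and more syntactically involved sentence appears last",
]
for batch in length_curriculum(corpus):
    print(batch)   # shorter (easier) examples are seen before longer ones
```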

Attention-based Contextual Language Model Adaptation for Speech Recognition

Richard Diehl Martinez, Scott Novotney, Ivan Bulyko, Ariya Rastrow, Andreas Stolcke, Ankur Gandhe

Conference: ACL 2021

We introduce a contextual attention mechanism for language models in speech recognition, incorporating non-linguistic data such as utterance time. Our model outperforms conventional LMs, reducing perplexity by 9.0% on long-tail utterances.

Read Paper →
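The sketch below shows one generic way to let an LM's hidden states attend over embeddings of a non-linguistic context feature (here, a single "hour of day" input) and fuse the result back into the representation. The module name, fusion-by-concatenation choice, and dimensions are assumptions for illustration, not the paper's architecture.

```python
# A generic sketch of attending over non-linguistic context embeddings from an
# LM's hidden states. Names, dimensions, and the fusion strategy are illustrative.
import torch
import torch.nn as nn

class ContextAttentionFusion(nn.Module):
    def __init__(self, hidden_dim: int, n_context: int = 24):
        super().__init__()
        self.context_emb = nn.Embedding(n_context, hidden_dim)   # e.g., hour of day
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads=4, batch_first=True)
        self.out = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, hidden: torch.Tensor, context_ids: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq, dim); context_ids: (batch, n_ctx) integer features
        ctx = self.context_emb(context_ids)                      # (batch, n_ctx, dim)
        attended, _ = self.attn(query=hidden, key=ctx, value=ctx)
        return self.out(torch.cat([hidden, attended], dim=-1))   # fused hidden states

fusion = ContextAttentionFusion(hidden_dim=256)
h = torch.randn(2, 10, 256)
hours = torch.randint(0, 24, (2, 1))                             # one context feature
print(fusion(h, hours).shape)                                    # torch.Size([2, 10, 256])
```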

Automatically Neutralizing Subjective Bias in Text

Reid Pryzant, Richard Diehl Martinez, Nathan Dass, Sadao Kurohashi, Dan Jurafsky, Diyi Yang

Conference: AAAI 2020

We present the first dataset for automatically neutralizing subjective bias in text, sourced from Wikipedia edits. Our BERT-based models achieve strong performance in identifying and neutralizing biased language across four domains.

Read Paper →