Hello, friend.

Lucy Li in a sweater with a big rock

I'm a PhD student at UC Berkeley's School of Information and BAIR, working on natural language processing (NLP) and computational social science. I'm supported by an NSF Graduate Research Fellowship and advised by David Bamman.

Katie Keith and I have a podcast, Diaries of Social Data Research, where we chat with researchers about the process behind interdisciplinary papers.

Pronouns: she/her


I'm interested in computational sociolinguistics and NLP for addressing social scientific questions.

Here's my CV, which also happens to have a slightly chronological, non-exhaustive list of people who took a leap of faith to support, teach, and guide me in computing research.

How do people talk about other people?
People's conversations are people-centric, since much of what they say focuses on others. I use NLP methods to quantify the discussion of individuals and social groups in text, including textbooks, fiction, and online forums.

What's in the long tail of the English language?
Digital data is filled with communities that use distinctive language, such as innovative or unique words and meanings. I'm mapping out linguistic landscapes in a variety of domains, including scientific articles and social media.

Select Publications

I publish with my name backwards, so citations should refer to "L. Lucy". I do this because my last name is one of the most common in the world, researchers are often recognized and remembered by last name, and computer vision researcher Fei-Fei Li does this, too. More thoughts from others about names and academia are here.

* = equal contribution

Discovering Differences in the Representation of People using Contextualized Semantic Axes
Li Lucy, Divya Tadimeti, David Bamman.
EMNLP 2022.
Paper. Code.

Characterizing English Variation across Social Media Communities with BERT
Li Lucy, David Bamman.
TACL 2021.
Paper. Code.

Gender and Representation Bias in GPT-3 Generated Stories
Li Lucy, David Bamman.
Narrative Understanding Workshop @ NAACL 2021.
Paper. Code.

Content Analysis of Textbooks via Natural Language Processing: Findings on Gender, Race, and Ethnicity in Texas U.S. History Textbooks
Li Lucy*, Dora Demszky*, Patricia Bromley, Dan Jurafsky.
AERA Open 2020.
Best Paper Award. Paper. Code. Slides.

Are distributional representations ready for the real world? Evaluating word vectors for grounded perceptual meaning.
Li Lucy, Jon Gauthier.
RoboNLP Workshop @ ACL 2017.
Paper. Poster.


Here are some Bay Area photos.

I'm a picky plant collector.

My friend's cat, Coco, would like some more Instagram followers: link.

Here are some Medium posts about NLP pedagogy.