Hello, friend.

Lucy Li in a sweater with a big rock


I'm a PhD student at UC Berkeley's School of Information and BAIR, working on natural language processing (NLP), computational social science, cultural analytics, and AI fairness. I'm supported by an NSF Graduate Research Fellowship and advised by David Bamman. I have been recognized as an Outstanding Intern of the Year at the Allen Institute for AI and a Rising Star in EECS and Data Science.

In Summer 2024, I will be teaching Social Aspects of Natural Language Processing.

Prospective PhD applicants, especially those from underrepresented backgrounds, are welcome to email me questions about the application process or the PhD experience.

Pronouns: she/her

Research

Here's my CV, which includes a non-exhaustive list of people from many institutions (Stanford, EPFL, Microsoft Research, Allen Institute for AI) who took a leap of faith to support, teach, and guide me in research.

I research how social groups are discussed and represented in language models and textual data (e.g. textbooks, fiction, and online forums). Though I publish primarily in computing venues, I am also passionate about bridging NLP with the humanities and social sciences, especially education and curriculum studies.

Katie Keith, Naitian Zhou, and I have a podcast, Diaries of Social Data Research, where we chat with researchers about the process behind interdisciplinary papers.

Select Publications

I publish with my name backwards, so citations should refer to "L. Lucy". I do this because my last name is one of the most common in the world, researchers are often recognized and remembered by last name, and computer vision researcher Fei-Fei Li does this, too. More thoughts from others about names and academia are here.

* = equal contribution

AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters
Li Lucy, Suchin Gururangan, Luca Soldaini, Emma Strubell, David Bamman, Lauren Klein, Jesse Dodge.
arXiv preprint. Paper.


"One-size-fits-all"? Expectations of NLG Systems Across Identity-Related Language Features
Li Lucy, Su Lin Blodgett, Milad Shokouhi, Hanna Wallach, Alexandra Olteanu.
NAACL 2024. Paper.


Words as Gatekeepers: Measuring Discipline-specific Terms and Meanings in Scholarly Publications
Li Lucy, Jesse Dodge, David Bamman, Katherine A. Keith.
Findings of ACL 2023. Paper. Code.


Discovering Differences in the Representation of People using Contextualized Semantic Axes
Li Lucy, Divya Tadimeti, David Bamman.
EMNLP 2022. Paper. Code.


Characterizing English Variation across Social Media Communities with BERT
Li Lucy, David Bamman.
TACL 2021. Paper. Code.


Gender and Representation Bias in GPT-3 Generated Stories
Li Lucy, David Bamman.
Narrative Understanding Workshop @ NAACL 2021. Paper. Code.


Content Analysis of Textbooks via Natural Language Processing: Findings on Gender, Race, and Ethnicity in Texas U.S. History Textbooks
Li Lucy*, Dora Demszky*, Patricia Bromley, Dan Jurafsky.
AERA Open 2020. Best Paper Award. Paper. Code. Slides.


Are distributional representations ready for the real world? Evaluating word vectors for grounded perceptual meaning
Li Lucy, Jon Gauthier.
RoboNLP Workshop @ ACL 2017. Paper. Poster.

Personal

I was born and raised in Minnesota. My cat's name is Toast. When I was a kid, I wanted to be an ornithologist and a fiction writer.