Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle

Name: Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle
Start: 2026-05-26T22:00:00+00:00
End: 2026-05-27T00:00:00+00:00
Location: 2112 Pennsylvania Ave NW, Washington, DC 20037, USA

Join us for an exciting talk by Mary Gibbs, Senior Applied Scientist at Relativity.

Agenda:

6:00 - 6:30 PM - Welcome and mingle
6:30 - 6:45 PM - Introductions
6:45 - 7:30 PM - Talk
7:30 - 8:00 PM - Wrap up

Description:

If you have ever shipped a model, watched your metrics improve, and later learned from your users that something was wrong, then the metrics were wrong all along. You just didn’t know it yet. An evaluation consists of three components: a benchmark, a scorer, and a claim about what a score represents. Each component has its own weaknesses. Benchmarks can suffer from narrow coverage, contamination, or saturation. Scorers are often chosen for ease of automation or computation rather than for their alignment with user outcomes. And the claim connecting a score to reality is rarely made explicit. These gaps compound across the model development lifecycle. When metrics improve, teams treat that as a signal and optimize directly against it, which is how a measurement problem becomes a model problem. This talk maps where evaluations can go wrong, considers counterarguments, and ends with practical advice for building better ones.
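As a rough illustration of the three components named in the description, here is a minimal Python sketch. It is not taken from the talk: the `Evaluation` class, the `exact_match` scorer, the `toy_model`, and the two-item benchmark are all hypothetical names and data invented for this example. It shows how a scorer chosen for ease of automation can mark a correct answer as wrong, so the score no longer supports the stated claim.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Everything below is hypothetical and for illustration only.
Example = Tuple[str, str]  # (prompt, reference answer)


@dataclass
class Evaluation:
    benchmark: List[Example]              # the items the model is tested on
    scorer: Callable[[str, str], float]   # maps (prediction, reference) to a score
    claim: str                            # what a score is supposed to mean

    def run(self, model: Callable[[str], str]) -> float:
        """Return the mean score of the model over the benchmark."""
        scores = [self.scorer(model(prompt), ref) for prompt, ref in self.benchmark]
        return sum(scores) / len(scores)


def exact_match(prediction: str, reference: str) -> float:
    """Easy to automate, but blind to correct answers phrased differently."""
    return 1.0 if prediction.strip() == reference.strip() else 0.0


def toy_model(prompt: str) -> str:
    """Stand-in for a real model: canned answers, both factually correct."""
    answers = {
        "2 + 2 = ?": "4",
        "Capital of France?": "The capital is Paris.",
    }
    return answers.get(prompt, "")


evaluation = Evaluation(
    benchmark=[("2 + 2 = ?", "4"), ("Capital of France?", "Paris")],
    scorer=exact_match,
    claim="A higher score means users get more correct answers.",
)

score = evaluation.run(toy_model)
print(score)  # 0.5, although a user would accept both answers
```

In this toy setup the gap between the score and the claim comes entirely from the scorer, not from the model.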

Speaker Bio:

Mary is a Senior Applied Scientist at Relativity, tackling data science challenges in the e-discovery and legal tech space. She is also an organizer for Women and Gender eXpansive Coders DC (formerly Women Who Code DC), fostering a community dedicated to empowering women and nonbinary individuals to excel in their careers. Mary's experience spans various domains. She has developed data science solutions related to job search and career progression at Teal, cybersecurity challenges at LiveAction Software, and commercial and government consulting at Mosaic Data Science. Before venturing into the field of data science, Mary conducted and published research pertaining to the cellular and molecular mechanisms underlying neurodevelopment at the National Institutes of Health. In other words, she has dissected and imaged a lot of fruit fly brains. She holds an M.S. in Data Science from The George Washington University and a B.A. in Biological Sciences from Cornell University.

About Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle in Washington

Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle is a free independent event taking place on Tuesday, May 26, 2026 at 2112 Pennsylvania Ave NW, Washington, DC 20037, USA. Attendance is free; register to secure your spot. Currently 23 people have registered out of 23 spots. The event runs for approximately 2 hours.

Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle — key details

Date: Tuesday, May 26, 2026
Time: 6:00 PM – 8:00 PM (America/New_York)
Duration: 2 hours
Location: 2112 Pennsylvania Ave NW, Washington, DC 20037, USA
Price: Free
Format: In-person
Attending: 23 registered

What to expect at Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle

Join this independent event for 2 hours of learning, discussion, and networking with fellow attendees.

Who should attend Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle?

This independent event in Washington is ideal for:

AI and machine learning enthusiasts and practitioners
Data professionals and analytics enthusiasts

Why attend?

Free to attend — no financial commitment required
Intimate group size means more meaningful connections

Washington event — May 2026

This evening event is part of the growing events scene in Washington. Whether you're based in Washington or visiting for the event, it's a great opportunity to connect with the local community. Browse more upcoming events in Washington on Rifio.

Topics and categories

Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle covers topics including AI and DC Data & AI Events. Find similar events by browsing these topics on Rifio.

Frequently asked questions about Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle

When is Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle?: Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle takes place on Tuesday, May 26, 2026 from 6:00 PM to 8:00 PM (America/New_York).
Where is Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle held?: Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle takes place at 2112 Pennsylvania Ave NW, Washington, DC 20037, USA.
How much does Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle cost?: Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle is free to attend. Register on the event page to secure your spot.
How many people are attending Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle?: 23 people have registered so far out of 23 available spots.
How do I register for Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle?: You can register for Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle through the original event page on Luma.
How long is Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle?: Your Evals Are Bad And You Should Feel Bad: Evaluation and the Model Development Lifecycle runs for approximately 2 hours, from 6:00 PM to 8:00 PM (America/New_York).