Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Posts

Future Blog Post

less than 1 minute read

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

publications

Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-Initiative Conversations

Ashwin Paranjape, Abigail See, Kathleen Kenealy, Haojun Li, Amelia Hardy, Peng Qi, Kaushik Ram Sadagopan, Nguyet Minh Phu, Dilara Soylu, Christopher D Manning
Paper

Published in Alexa Prize Proceedings! We present Chirpy Cardinal, an open-domain dialogue agent that won 2nd place in the 2019 Alexa Prize Competition. Our open-source code is here!

Effective Social Chatbot Strategies for Increasing User Initiative

Amelia Hardy, Ashwin Paranjape, Christopher Manning
Paper

Published at SIGDIAL! We study strategies for increasing user initiative in human-bot conversations and show that simple automated metrics correlate with human judgment of initiative.

Neural Generation Meets Real People: Building a Social, Informative Open-Domain Dialogue Agent

Ethan A Chi, Ashwin Paranjape, Abigail See, Caleb Chiam, Trenton Chang, Kathleen Kenealy, Swee Kiat Lim, Amelia Hardy, Chetanya Rastogi, Haojun Li, Alexander Iyabor, Yutong He, Hari Sowrirajan, Peng Qi, Kaushik Ram Sadagopan, Nguyet Minh Phu, Dilara Soylu, Jillian Tang, Avanika Narayan, Giovanni Campagna, Christopher D Manning
Paper

Published in SIGDIAL! We present V2 of Chirpy Cardinal, an open-domain dialogue agent that won 2nd place in the 2020 Alexa Prize Competition.

Evaluating Human-Language Model Interaction

Mina Lee, Megha Srivastava, Amelia Hardy, John Thickstun, Esin Durmus, Ashwin Paranjape, Ines Gerard-Ursin, Xiang Lisa Li, Faisal Ladhak, Frieda Rong, Rose E Wang, Minae Kwon, Joon Sung Park, Hancheng Cao, Tony Lee, Rishi Bommasani, Michael Bernstein, Percy Liang
Paper

Published in TMLR! We evaluate human-LM interaction on the tasks of social dialogue, question answering, crossword puzzles, summarization, and metaphor generation and highlight cases where the results from non-interactive and interactive metrics diverge.

Inferring Traffic Models in Terminal Airspace from Flight Tracks and Procedures

Soyeon Jung, Amelia Hardy, Mykel J Kochenderfer

We present a simple and interpretable approach to modeling flight trajectories that leverages Gaussian Mixture Models specific to each flight segment.

BetterBench: Assessing AI Benchmarks, Uncovering Issues, and Establishing Best Practices

Anka Reuel*, Amelia Hardy*, Chandler Smith, Max Lamparth, Malcolm Hardy, Mykel J. Kochenderfer
Paper

Spotlighted at NeurIPS! In this work, we propose best practices for informative, reproducible AI benchmarks and evaluate a set of benchmarks according to these criteria.

More than Marketing? On the Information Value of AI Benchmarks for Practitioners

Amelia Hardy*, Anka Reuel*, Kiana Jafari Meimandi, Lisa Soder, Allie Griffith, Dylan M Asmar, Sanmi Koyejo, Michael S Bernstein, Mykel J Kochenderfer
Paper

We present a qualitative, interview-based study of how decision-makers in research, product, and policy roles use AI benchmarks in practice.

ASTPrompter: Preference-Aligned Automated Language Model Red-Teaming to Generate Low-Perplexity Unsafe Prompts

Amelia Hardy*, Houjun Liu*, Allie Griffith, Bernard Lange, Duncan Eddy, Mykel J Kochenderfer
Paper

We apply the Adaptive Stress Testing (AST) framework to language modeling to identify prompts that are both effective for red-teaming and likely to occur under natural autoregression.

Pages

Page Not Found

Archive Layout with Content

Posts by Category

Posts by Collection

Markdown

Page not in menu

Page Archive

Portfolio

Publications

Sitemap

Posts by Tags

Talk map

Teaching

Terms and Privacy Policy

Blog posts

Jupyter notebook markdown generator

talks

Effective Social Chatbot Strategies for Increasing User Initiative

Developing an LLM-Based Cockpit Assistant

ASTPrompter: Weakly Supervised Automated Language Model Red-Teaming to Identify Likely Toxic Prompts

AA228/CS238 Decision Making Under Uncertainty Lecture on Policy Gradient Estimation