Cornell Movie--Dialogs Corpus

Submitted on Dec 11 2019

By: Cristian Danescu-Niculescu-Mizil and Lillian Lee

Project: http://www.cs.cornell.edu/~cristian/Corn...

From: Cristian Danescu-Niculescu-Mizil

Paper: http://www.cs.cornell.edu/~cristian/Cham...


Resource Type:

Data

License:

Language:

Data Format:

.txt

Description

This corpus contains a large, metadata-rich collection of fictional conversations extracted from raw movie scripts:

- 220,579 conversational exchanges between 10,292 pairs of movie characters
- 9,035 characters from 617 movies
- 304,713 utterances in total
- movie metadata included:
  - genres
  - release year
  - IMDB rating
  - number of IMDB votes
- character metadata included:
  - gender (for 3,774 characters)
  - position on movie credits (for 3,321 characters)
- see README.txt (included) for details
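Since the corpus ships as plain .txt files, a minimal parsing sketch may help when getting started. It assumes each utterance sits on one line, with fields joined by the separator ` +++$+++ ` in the order line ID, character ID, movie ID, character name, text. That layout is an assumption here and should be checked against the included README.txt; the helper names and the sample rows below are made up for illustration.

```python
from collections import Counter

# Assumed field separator and column order; verify both against README.txt.
FIELD_SEP = " +++$+++ "
LINE_FIELDS = ("line_id", "character_id", "movie_id", "character_name", "text")


def parse_record(raw, field_names):
    """Split one raw line into a dict keyed by the given field names."""
    # Limit the number of splits so the last field keeps any stray separators.
    parts = raw.rstrip("\n").split(FIELD_SEP, len(field_names) - 1)
    if len(parts) != len(field_names):
        raise ValueError(f"expected {len(field_names)} fields, got {len(parts)}")
    return dict(zip(field_names, parts))


def utterances_per_movie(lines):
    """Count utterances per movie ID from an iterable of raw lines."""
    return Counter(
        parse_record(line, LINE_FIELDS)["movie_id"]
        for line in lines
        if line.strip()  # skip blank lines
    )


# Hypothetical sample rows in the assumed layout (not taken from the corpus).
sample = [
    "L1 +++$+++ u0 +++$+++ m0 +++$+++ ALICE +++$+++ Hello there.\n",
    "L2 +++$+++ u1 +++$+++ m0 +++$+++ BOB +++$+++ Hi. Long time.\n",
    "L3 +++$+++ u2 +++$+++ m1 +++$+++ CAROL +++$+++ Who is it?\n",
]

first = parse_record(sample[0], LINE_FIELDS)
counts = utterances_per_movie(sample)
```

To run this on the real data, pass an open file handle to `utterances_per_movie` in place of `sample`. The file name and text encoding are not stated in this listing, so take both from README.txt.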

Categorized in: Machine Learning | Natural Language | Discourse & Dialogue
