Cornell Movie--Dialogs Corpus

Submitted on Dec 11 2019
By: Cristian Danescu-Niculescu-Mizil and Lillian Lee
From: Cristian Danescu-Niculescu-Mizil
Resource Type:
Data
License:
Language:
Data Format:
.txt

Description

This corpus contains a large metadata-rich collection of fictional conversations extracted from raw movie scripts:

- 220,579 conversational exchanges between 10,292 pairs of movie characters
- involves 9,035 characters from 617 movies
- in total 304,713 utterances
- movie metadata included:
  - genres
  - release year
  - IMDB rating
  - number of IMDB votes
- character metadata included:
  - gender (for 3,774 characters)
  - position on movie credits (3,321 characters)
- see README.txt (included) for details
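For readers who want to work with the utterances, the following is a minimal sketch of parsing one line of the utterance file. It assumes each line holds five fields (line ID, character ID, movie ID, character name, utterance text) separated by the string " +++$+++ ", and that the file is ISO-8859-1 encoded. Both are assumptions to check against the included README.txt. The names `Utterance`, `parse_utterance`, and `load_utterances` are hypothetical helpers, not part of the corpus distribution.

```python
from typing import List, NamedTuple

# Assumed field separator; confirm against the README.txt shipped with the corpus.
FIELD_SEP = " +++$+++ "


class Utterance(NamedTuple):
    """One utterance record (field layout is an assumption, see README.txt)."""
    line_id: str
    character_id: str
    movie_id: str
    character_name: str
    text: str


def parse_utterance(raw: str) -> Utterance:
    """Split one raw line into its five assumed fields."""
    # maxsplit=4 keeps any separator-like text inside the utterance intact.
    parts = raw.rstrip("\n").split(FIELD_SEP, 4)
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {raw!r}")
    return Utterance(*parts)


def load_utterances(path: str) -> List[Utterance]:
    """Read every non-blank line of the file at `path` into Utterance records."""
    # ISO-8859-1 is an assumed encoding; adjust if the README says otherwise.
    with open(path, encoding="iso-8859-1") as fh:
        return [parse_utterance(line) for line in fh if line.strip()]
```

A made-up sample line such as `parse_utterance("L1 +++$+++ u0 +++$+++ m0 +++$+++ ALICE +++$+++ Hello there.\n")` would yield a record whose `text` field is `"Hello there."`. If the sketch's assumptions hold, the number of records loaded from the utterance file should match the 304,713 utterances reported above.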