4

I am looking for conversation transcripts. These can be from real sources such as radio or TV interviews, or even phone and IM captures, or fictional conversations such as plays and movies.

I don't need the actual audio/video; I am only looking for transcripts at this time. Personally identifying information is unnecessary: aliases or even [SPEAKER 1] would be fine.
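To make the speaker-label requirement concrete, here is a minimal sketch of how such a transcript could be split into (speaker, utterance) turns. The line formats it accepts (`[SPEAKER 1] text` and `ALIAS: text`) and the name `parse_turns` are assumptions made for illustration, not the format of any particular dataset.

```python
import re
from typing import List, Tuple

# Assumed (hypothetical) line formats: "[SPEAKER 1] utterance" or "ALIAS: utterance".
TURN = re.compile(
    r"^\s*(?:\[(?P<tag>[^\]]+)\]|(?P<name>[A-Za-z][\w .'-]{0,40}):)\s*(?P<text>.+)$"
)

def parse_turns(transcript: str) -> List[Tuple[str, str]]:
    """Split a transcript into (speaker, utterance) pairs."""
    turns: List[Tuple[str, str]] = []
    for line in transcript.splitlines():
        if not line.strip():
            continue  # skip blank lines
        match = TURN.match(line)
        if match:
            speaker = (match.group("tag") or match.group("name")).strip()
            turns.append((speaker, match.group("text").strip()))
        elif turns:
            # Unlabelled line: treat it as a continuation of the previous turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, text + " " + line.strip())
        # Unlabelled lines before the first speaker are dropped.
    return turns

sample = "[SPEAKER 1] Hello there.\nHOST: Hi, welcome.\nThanks for coming.\n"
for speaker, text in parse_turns(sample):
    print(f"{speaker}: {text}")
```

A source with no speaker labels at all yields an empty list here, which is a quick way to screen out unusable transcripts. Note that this simple pattern would also misread any ordinary sentence containing an early colon as a speaker label, so real data would need a stricter rule.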

I would prefer to avoid mixed narratives like "This American Life".

YouTube could be a good source for interviews that have closed captioning, in which case I just need the list of videos.

Thanks

Franck Dernoncourt
ChronoFish
  • Would movie and TV series subtitles be OK for you? – user_0 Nov 12 '15 at 15:46
  • Yes - pretty much anything that has dialog in text format. – ChronoFish Nov 12 '15 at 17:52
  • I should clarify a little bit. Closed captions are fine, but they need to identify the speaker so that I can follow the context. This becomes really messy if no speaker is identified and the dialog involves more than two people. I didn't realize how fragmented movie dialog was until I tried to read the CC without the visual cues. Lesson learned! – ChronoFish Nov 18 '15 at 00:02
  • The other piece I should clarify: I am really looking for a database of transcripts. One-offs are nice, but a data source is the ideal. – ChronoFish Nov 18 '15 at 00:03

3 Answers

3

Open.edu is another good source for conversation transcripts of interviews, although you'll have to collect them from the pages yourself, as I do not see a download button:
http://www.open.edu/openlearn/history-the-arts/culture/english-language/example-interview-transcripts

albert
  • Thank you @albert. I was able to see the example interview, which was perfect! However, it was the only one. It was more an example of "how to perform an interview" than a database of transcripts. – ChronoFish Nov 17 '15 at 23:58
1

There is GECO at Stuttgart University (http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/IMS-GECO.html).

It is not exactly open data, but it is available free of charge for academic research. See the linked website for how to get access to GECO.

  • Is the content in English? The source looks to be German. Thanks! – ChronoFish Nov 12 '15 at 17:56
  • @ChronoFish Yes, it contains English and German dialogues. For details see the metadata record here: https://vlo.clarin.eu/record;jsessionid=818D4AFB096E8025EF58C43CE71ED80B?0&q=geco&docId=http://hdl.handle.net/11858/00-247C-0000-0023-512E-7 – jknappen Nov 13 '15 at 07:41
  • Thanks @jknappen. I contacted them and they informed me that all of the conversations are in German. So close... :) – ChronoFish Nov 17 '15 at 23:54
1

I mostly know Italian subtitle sites, but there are a lot of English ones as well.
For example: http://www.english-subtitles.pro

I don't know whether this is actually legal.

user_0
  • Thanks... The data doesn't identify the character who is speaking (by alias or otherwise), so there is no way that I can "follow" the dialog. As you mentioned, I'm not sure about the legality of it either, so it's not exactly "open" data :) – ChronoFish Nov 17 '15 at 23:56