The NPS Chat Corpus is a sample of messages from various online chat services, tagged with parts of speech at the word level and with dialogue acts at the post level. Excerpts from the dataset description:
Description of the NPS Chat Corpus
The NPS Chat Corpus, Release 1.0 consists of 10,567 posts out of
approximately 500,000 posts we have gathered from various online chat
services in accordance with their terms of service. Future releases
will contain more posts from more domains. New releases will be
announced and described at
http://faculty.nps.edu/cmartell/NPSChat.htm.
The posts included in Release 1.0 have been:
1) Hand privacy masked;
2) Part-of-speech tagged; and
3) Dialogue-act tagged.
...
The dialogue-act tags are Accept, Bye, Clarify, Continuer, Emotion,
Emphasis, Greet, No Answer, Other, Reject, Statement, System,
Wh-Question, Yes Answer, Yes/No Question. (See [2] and [3], below.)
Sample Post
Here is a sample post from the corpus:
<Post class="whQuestion" user="11-08-teensUser117">whats balck and white and red all over?<terminals>
<t pos="WP" word="whats"/>
<t pos="^JJ" word="balck"/>
<t pos="CC" word="and"/>
<t pos="JJ" word="white"/>
<t pos="CC" word="and"/>
<t pos="JJ" word="red"/>
<t pos="DT" word="all"/>
<t pos="IN" word="over"/>
<t pos="." word="?"/>
</terminals> </Post>
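The sample post above can be read with a standard XML parser. The following is a minimal sketch using Python's built-in xml.etree.ElementTree. The function name parse_post and the keys of the returned dictionary are hypothetical and chosen for illustration only; they are not part of the corpus release. The element and attribute names (Post, t, class, user, pos, word) are taken from the sample itself.

```python
import xml.etree.ElementTree as ET

# The sample post from the corpus description, copied verbatim
# (including the original misspelling "balck").
SAMPLE_POST = """<Post class="whQuestion" user="11-08-teensUser117">whats balck and white and red all over?<terminals>
<t pos="WP" word="whats"/>
<t pos="^JJ" word="balck"/>
<t pos="CC" word="and"/>
<t pos="JJ" word="white"/>
<t pos="CC" word="and"/>
<t pos="JJ" word="red"/>
<t pos="DT" word="all"/>
<t pos="IN" word="over"/>
<t pos="." word="?"/>
</terminals> </Post>"""


def parse_post(xml_text):
    """Parse one <Post> element into a plain dictionary (hypothetical helper)."""
    post = ET.fromstring(xml_text)
    # Each <t> element carries one token and its part-of-speech tag.
    tokens = [(t.get("word"), t.get("pos")) for t in post.iter("t")]
    return {
        "dialogue_act": post.get("class"),  # post-level dialogue-act tag
        "user": post.get("user"),           # privacy-masked user ID
        "text": (post.text or "").strip(),  # raw text before <terminals>
        "tokens": tokens,                   # word-level (word, pos) pairs
    }


parsed = parse_post(SAMPLE_POST)
print(parsed["dialogue_act"])  # whQuestion
print(parsed["tokens"][:2])    # [('whats', 'WP'), ('balck', '^JJ')]
```

In this sample the misspelled word "balck" carries the tag ^JJ rather than the JJ given to "white" and "red", so code that groups tokens by tag may need to decide whether to treat the two as the same category.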
Citation
Eric N. Forsyth and Craig H. Martell, "Lexical and Discourse Analysis
of Online Chat Dialog," Proceedings of the First IEEE International
Conference on Semantic Computing (ICSC 2007), pp. 19-26, September
2007. http://faculty.nps.edu/cmartell/NPSChat.htm.