Part-of-speech tagging

This homework is about part-of-speech tagging. Your data will be from Twitter. Select three English-language tweets that you’d like to work with.

Using the annotation framework described in this paper, label each word in your three tweets with its part-of-speech tag.

Next, using the Penn Treebank part-of-speech tags, annotate the fine-grained POS tag of each verb in your data (e.g., VBD, VBZ, MD). You do not need to annotate the non-verbs.
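
One possible way to record the verb annotations and sanity-check them is sketched below. It assumes annotations are stored as `(token, tag)` pairs; the helper name `check_verb_annotations` and the example sentence are hypothetical and are not part of the assignment.

```python
# Penn Treebank verb tags: the six VB* forms plus the modal tag MD.
PTB_VERB_TAGS = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ", "MD"}


def check_verb_annotations(annotations):
    """Return the (token, tag) pairs whose tag is not a Penn Treebank verb tag."""
    return [(tok, tag) for tok, tag in annotations if tag not in PTB_VERB_TAGS]


# Hypothetical sentence (not a real tweet): "she has been running and can not stop"
verbs = [
    ("has", "VBZ"),      # 3rd person singular present
    ("been", "VBN"),     # past participle
    ("running", "VBG"),  # gerund / present participle
    ("can", "MD"),       # modal
    ("stop", "VB"),      # base form
]

invalid = check_verb_annotations(verbs)
print(invalid if invalid else "all verb tags are valid Penn Treebank verb tags")
```

A check like this only catches tags that fall outside the verb tag inventory; it cannot tell you whether a valid tag was applied to the right word.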

Finally, using the Brown part-of-speech tagset, annotate the fine-grained POS tag of each verb in one sentence of your data (e.g., MD*, MD+HV, VBG+TO). Try to choose a sentence that involves a tag that is not covered in the previous two tagsets.
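
The example tags above can be read by decomposing them, as in this minimal sketch. It assumes the usual Brown conventions, where `+` joins the tags of a token that fuses two words (as in "should've" or "gonna") and a trailing `*` marks a negated form (as in "can't"). The function name `split_brown_tag` is hypothetical, and the sketch ignores other Brown suffixes such as `-TL` or `-HL`.

```python
def split_brown_tag(tag):
    """Split a Brown-style tag into its component tags and a negation flag.

    Assumes '+' separates the tags of a fused token and a trailing '*'
    marks negation. A bare '*' (the tag for "not") is kept as-is.
    """
    parts = tag.split("+")
    negated = any(p.endswith("*") for p in parts)
    components = [p if p == "*" else p.rstrip("*") for p in parts]
    return components, negated


for tag in ["MD*", "MD+HV", "VBG+TO"]:
    components, negated = split_brown_tag(tag)
    print(tag, components, "negated" if negated else "not negated")
```

Seeing the components side by side makes it easier to spot distinctions, such as fused or negated forms, that a tagset with one tag per token cannot express.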