NLTK Cheat Sheet

I am currently reading Natural Language Processing with Python, an excellent introduction to computational linguistics. Below is a list of technical terms:

Bigram = pair of adjacent words, e.g. “In the”, “the beginning”, “beginning god”, “god created”, “created earth”
Lexical Entry = Lemma + Gloss = Headword + Sense definition
Wordlist Corpora = (sophisticated) wordlist
Stopwords = high-frequency words such as to, in, a.
Swadesh wordlists = Lists of ca. 200 common words in several languages
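
To make two of the terms above concrete, here is a minimal sketch in plain Python. It does not use NLTK itself, so it runs without downloading any corpora. The helper names `bigrams` and `remove_stopwords` and the four-word `STOPWORDS` set are my own illustrations, not NLTK's API or data. NLTK provides `nltk.bigrams` and `nltk.corpus.stopwords` for the same purposes.

```python
def bigrams(tokens):
    """Return the list of adjacent word pairs in a token sequence."""
    return list(zip(tokens, tokens[1:]))


# Tiny hand-picked stopword set, for illustration only (not NLTK's list).
STOPWORDS = {"in", "the", "to", "a"}


def remove_stopwords(tokens):
    """Drop high-frequency words, comparing case-insensitively."""
    return [t for t in tokens if t.lower() not in STOPWORDS]


tokens = "In the beginning God created the earth".split()

print(bigrams(tokens)[:3])
# [('In', 'the'), ('the', 'beginning'), ('beginning', 'God')]

print(remove_stopwords(tokens))
# ['beginning', 'God', 'created', 'earth']
```

Pairing the token list with a copy of itself shifted by one position is all a bigram extractor needs, which is why `zip(tokens, tokens[1:])` is enough here.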

