NLTK Cheat Sheet

I am currently reading Natural Language Processing with Python, an excellent introduction to computational linguistics. Below is a list of technical terms:

Bigram = pair of adjacent words, e.g. “In the”, “the beginning”, “beginning god”, “god created”, “created earth”
Lexical Entry = Lemma + Gloss = Headword + Sense definition
Wordlist Corpora = corpora that are simply (sophisticated) wordlists
Stopwords = high frequency words such as to, in, a.
Swadesh wordlists = Lists of ca. 200 common words in several languages
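
To make bigrams and stopwords a bit more concrete, here is a minimal sketch in plain Python. It is only an illustration: the helper names `bigrams` and `remove_stopwords` and the tiny `STOPWORDS` set are my own assumptions, not taken from the book. In NLTK itself, `nltk.bigrams(tokens)` yields the same pairs, and `nltk.corpus.stopwords.words('english')` returns a much longer stopword list once the stopwords corpus has been downloaded.

```python
def bigrams(tokens):
    """Return the list of adjacent token pairs in `tokens`."""
    return list(zip(tokens, tokens[1:]))


# Hypothetical mini stopword list, for illustration only.
# NLTK's real English list is far longer.
STOPWORDS = {"to", "in", "a", "the", "and"}


def remove_stopwords(tokens):
    """Drop tokens whose lowercase form is in STOPWORDS."""
    return [t for t in tokens if t.lower() not in STOPWORDS]


tokens = "In the beginning God created the heaven and the earth".split()

pairs = bigrams(tokens)             # adjacent word pairs
content = remove_stopwords(tokens)  # tokens left after filtering

print(pairs[:3])
print(content)
```

Filtering out stopwords before building bigrams tends to give pairs of content words, which is often what you want when looking for typical word combinations.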
