Simply put, a lemma is the base form of a word, typically found in a dictionary where it’s known as a headword.
From a lemma we can form many related words. For example, here are a few lemmas and their forms:
do: do, does, did, doing
run: run, runs, running, ran
fruit: fruits, fruity, fruitful
Interestingly, the 10 most common lemmas make up about 25% of all words we use!
Lemmas in Linguistics
In morphology, a lemma is the standard form of a lexeme. The lexeme is the set of all the forms that share the same meaning, and the lemma is the particular form chosen by convention to represent that lexeme.
In lexicography, a lemma is usually the headword under which a lexeme is indexed. In a dictionary, the lemma go represents the inflected forms go, goes, going, went, and gone. The relationship between an inflected form and its lemma is usually denoted by an angle bracket: “went” < “go”.
Lemmas are often used in corpus linguistics to determine word frequency, because counting by lemma groups all the inflected forms of a word under a single entry.
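To make the idea of counting by lemma concrete, here is a minimal Python sketch. The lookup table LEMMA_OF and the function lemma_frequencies are hypothetical names invented for this illustration, and the table covers only a handful of words; a real corpus study would rely on a full lemmatizer rather than a hand-made dictionary.

```python
from collections import Counter

# Hypothetical, hand-made lookup from inflected form to lemma.
# It only covers a few words and exists purely for illustration.
LEMMA_OF = {
    "go": "go", "goes": "go", "going": "go", "went": "go", "gone": "go",
    "do": "do", "does": "do", "did": "do", "doing": "do",
    "mouse": "mouse", "mice": "mouse",
}

def lemma_frequencies(tokens):
    """Count tokens by lemma; words missing from the table count as their own lemma."""
    return Counter(LEMMA_OF.get(t.lower(), t.lower()) for t in tokens)

tokens = "She went home and he goes home while the mice did nothing".split()
freq = lemma_frequencies(tokens)

# "went" and "goes" are both counted under the lemma "go".
print(freq["go"], freq["mouse"], freq["do"])
```

Counting surface forms instead would list "went" and "goes" as two separate words with one occurrence each, which is why frequency lists are usually built on lemmas.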
Note that a lemma is derived from the root of a word and that sometimes a root will give more than one lemma.
Forming the Lemma
In English, the headword of a noun is the singular: e.g., mouse rather than mice; fruit rather than fruits.
For multi-word lexemes that contain possessive adjectives or reflexive pronouns, the headword uses a form of the indefinite pronoun one: e.g., do one’s best, perjure oneself.
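The substitution described above can be sketched in a few lines of Python. The word lists and the function name headword_form are assumptions made for this example, not part of any standard tool. The sketch also assumes the verb is already in its base form, and it treats every "her" and "his" as a possessive, which is not always true in running text.

```python
# Hypothetical word lists, chosen for this illustration only.
POSSESSIVES = {"my", "your", "his", "her", "its", "our", "their"}
REFLEXIVES = {
    "myself", "yourself", "himself", "herself", "itself",
    "ourselves", "yourselves", "themselves",
}

def headword_form(phrase):
    """Replace possessives with "one's" and reflexive pronouns with "oneself"."""
    words = []
    for word in phrase.lower().split():
        if word in POSSESSIVES:
            words.append("one's")
        elif word in REFLEXIVES:
            words.append("oneself")
        else:
            words.append(word)  # all other words are kept unchanged
    return " ".join(words)

print(headword_form("do my best"))       # do one's best
print(headword_form("perjure himself"))  # perjure oneself
```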
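With verbs, the lemma is usually either the infinitive or the first-person singular of the present tense, depending on the language. In English it is the bare infinitive: e.g., go rather than goes or went.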