There are two common ways to train the probabilities of the unknown word model <UNK>. The first is to turn the problem back into a closed-vocabulary one by choosing a fixed vocabulary in advance (a short sketch of these steps follows the list):

1. Choose a vocabulary (word list) that is fixed in advance.
2. Convert in the training set any word that is not in this set (any OOV word) to the unknown word token <UNK> in a text normalization step.
3. Estimate the probabilities for <UNK> from its counts just like any other regular word in the training set.
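A minimal sketch of these three steps in Python, assuming a toy vocabulary and a unigram count-based estimate (the names and data here are illustrative, not from the text):

```python
from collections import Counter

UNK = "<UNK>"

# Step 1: a vocabulary (word list) fixed in advance (toy example).
vocabulary = {"the", "cat", "sat", "on", "mat"}

def normalize_oov(tokens, vocab):
    """Step 2: replace any OOV token with <UNK> during text normalization."""
    return [tok if tok in vocab else UNK for tok in tokens]

training = "the cat sat on the aardvark".split()
normalized = normalize_oov(training, vocabulary)
# -> ['the', 'cat', 'sat', 'on', 'the', '<UNK>']

# Step 3: estimate probabilities for <UNK> from its counts just like
# any other word (here, a unigram maximum likelihood estimate).
counts = Counter(normalized)
p_unk = counts[UNK] / len(normalized)   # 1/6 in this toy corpus
```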
The second alternative, in situations where we don't have a vocabulary fixed in advance, is to create such a vocabulary implicitly, replacing words in the training data by <UNK> based on their frequency.
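One common way to realize this, sketched below, is to replace every word that occurs fewer than some threshold number of times in the training data with <UNK>; the threshold `min_count` is an illustrative parameter, not a value fixed by the text:

```python
from collections import Counter

UNK = "<UNK>"

def replace_rare(tokens, min_count=2):
    """Implicitly build the vocabulary: any word occurring fewer than
    min_count times in the training data is replaced with <UNK>."""
    counts = Counter(tokens)
    return [tok if counts[tok] >= min_count else UNK for tok in tokens]

training = "the cat sat on the mat the cat ran".split()
print(replace_rare(training))
# -> ['the', 'cat', '<UNK>', '<UNK>', 'the', '<UNK>', 'the', 'cat', '<UNK>']
```

After this replacement, the surviving word types form the vocabulary, and the probability of <UNK> is again estimated from its counts like any other word.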