Language Models
-
class pynlpl.lm.lm.ARPALanguageModel(filename, encoding='utf-8', encoder=None, base_e=True, dounknown=True, debug=False, mode='simple')
Full back-off language model, loaded from a file in ARPA format.
This class does not build the model; it only lets you use a pre-computed one. To build the model, use a tool such as ngram-count from SRILM.
-
class NgramsProbs(data, mode='simple', delim=' ')
Stores n-grams with their probabilities and back-off values.
This class abstracts the physical storage layout, which makes memory/speed trade-offs possible.
-
backoff(ngram)
Return the back-off value of a given n-gram tuple.
-
prob(ngram)
Return the probability of a given n-gram tuple.
-
-
score(data, history=None)
-
scoreword(word, history=None)
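The back-off lookup that a model of this kind performs can be sketched in plain Python. This is a minimal illustration under stated assumptions, not pynlpl's implementation: the function `backoff_logprob` and the toy tables `PROBS` and `BACKOFFS` are hypothetical, and the values are assumed to be log10 probabilities, as is conventional in ARPA files.

```python
# Hypothetical toy tables keyed by n-gram tuples (log10 values, as in ARPA files).
PROBS = {
    ("the",): -1.0,
    ("cat",): -2.0,
    ("the", "cat"): -0.5,
}
BACKOFFS = {
    ("the",): -0.3,
    ("cat",): -0.2,
}


def backoff_logprob(ngram, probs=PROBS, backoffs=BACKOFFS):
    """Return the log10 probability of the last word given the preceding words."""
    ngram = tuple(ngram)
    if ngram in probs:
        # The full n-gram was seen: use its stored probability directly.
        return probs[ngram]
    if len(ngram) == 1:
        # Nothing left to back off to; a real model needs an unknown-word policy.
        raise KeyError("unknown word: %r" % (ngram[0],))
    # Back off: add the back-off weight of the history (0 if it has none),
    # then retry with the history shortened by its first word.
    history = ngram[:-1]
    return backoffs.get(history, 0.0) + backoff_logprob(ngram[1:], probs, backoffs)
```

Here `("the", "cat")` is found directly, while `("cat", "the")` is unseen and falls back to the back-off weight of `("cat",)` plus the unigram probability of `("the",)`.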
-
class pynlpl.lm.lm.SimpleLanguageModel(n=2, casesensitive=True, beginmarker='<begin>', endmarker='<end>')
A simple unsmoothed language model. This class can both compute and hold the model.
-
append(sentence)
-
load(filename)
-
save(filename)
-
scoresentence(sentence)
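To make "unsmoothed" concrete, here is a small self-contained sketch of such a model. It is not pynlpl code: the class `TinyUnsmoothedLM` is hypothetical and only borrows the method names `append` and `scoresentence` and the default markers from the signature above. It assumes maximum-likelihood estimates, so any n-gram that was never seen gives the whole sentence a score of zero.

```python
from collections import Counter


class TinyUnsmoothedLM:
    """Hypothetical illustration of an unsmoothed n-gram model (not pynlpl code)."""

    def __init__(self, n=2, beginmarker="<begin>", endmarker="<end>"):
        self.n = n
        self.beginmarker = beginmarker
        self.endmarker = endmarker
        self.ngrams = Counter()     # counts of full n-grams
        self.histories = Counter()  # counts of the (n-1)-word histories

    def _windows(self, sentence):
        # Pad the start so the first word gets a full history, and mark the end.
        words = [self.beginmarker] * (self.n - 1) + sentence.split() + [self.endmarker]
        for i in range(len(words) - self.n + 1):
            yield tuple(words[i:i + self.n])

    def append(self, sentence):
        """Add the n-gram counts of one whitespace-tokenised sentence."""
        for ngram in self._windows(sentence):
            self.ngrams[ngram] += 1
            self.histories[ngram[:-1]] += 1

    def scoresentence(self, sentence):
        """Return the product of maximum-likelihood n-gram probabilities."""
        score = 1.0
        for ngram in self._windows(sentence):
            if self.ngrams[ngram] == 0:
                return 0.0  # no smoothing: an unseen n-gram zeroes the score
            score *= self.ngrams[ngram] / self.histories[ngram[:-1]]
        return score


lm = TinyUnsmoothedLM(n=2)
lm.append("the cat sat")
lm.append("the dog sat")
seen = lm.scoresentence("the cat sat")
unseen = lm.scoresentence("the bird sat")
```

In this toy run, `seen` is 0.5, because "cat" follows "the" in one of two cases and every other bigram is certain, and `unseen` is 0.0, because the bigram ("the", "bird") never occurred.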
-
-
class pynlpl.lm.srilm.SRILM(filename, n)
-
logscore(ngram)
-
scoresentence(sentence, unknownwordprob=-12)
-
-
exception pynlpl.lm.srilm.SRILMException
Base exception for SRILM.
-
class pynlpl.lm.server.LMNGramFactory(lm)
-
protocol
Alias of LMNGramProtocol
-
-
class pynlpl.lm.server.LMNGramProtocol
-
lineReceived(ngram)
Override this for when each line is received.
Parameters: line (bytes): the line which was received, with the delimiter removed.
-
-
class pynlpl.lm.server.LMSentenceFactory(lm)
-
protocol
Alias of LMSentenceProtocol
-
-
class pynlpl.lm.server.LMSentenceProtocol
-
lineReceived(sentence)
Override this for when each line is received.
Parameters: line (bytes): the line which was received, with the delimiter removed.
-
-
class pynlpl.lm.server.LMServer(lm, port=12346, n=0)
Language Model Server
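The general idea of a line-oriented scoring server can be sketched with the standard library alone. This is an illustration under assumptions, not the pynlpl server: it uses `socketserver` rather than the protocol classes listed above, and `ToyLM`, `handle_line`, `SentenceHandler`, `make_server`, and the one-score-per-line reply format are all hypothetical, since the wire format is not described here.

```python
import socketserver


class ToyLM:
    """Hypothetical stand-in for a language model object."""

    def scoresentence(self, words):
        # Placeholder score: simply the number of words.
        return float(len(words))


def handle_line(lm, line):
    """Decode one request line, score it, and return the reply as bytes."""
    words = line.decode("utf-8").strip().split()
    return (str(lm.scoresentence(words)) + "\n").encode("utf-8")


class SentenceHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Answer every received line with one score line until the client disconnects.
        for line in self.rfile:
            self.wfile.write(handle_line(self.server.lm, line))


def make_server(lm, host="127.0.0.1", port=12346):
    """Create a TCP server that scores sentences with lm (binds immediately)."""
    server = socketserver.TCPServer((host, port), SentenceHandler)
    server.lm = lm  # make the model reachable from the handler
    return server
```

Calling `make_server(ToyLM()).serve_forever()` would start the loop. Keeping the per-line logic in `handle_line` means it can be exercised without opening a socket.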