Language Models

class pynlpl.lm.lm.ARPALanguageModel(filename, encoding='utf-8', encoder=None, base_e=True, dounknown=True, debug=False, mode='simple')

Full back-off language model, loaded from a file in ARPA format.

This class does not build the model; it only lets you use a pre-computed one. To build the model, use an external tool such as ngram-count from SRILM.
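
For orientation, the following is a minimal sketch of what reading an ARPA file involves. It is not pynlpl's loader: the function name `parse_arpa` and the returned dictionary layout are hypothetical, and the sketch assumes tab-separated fields as written by SRILM. ARPA files store base-10 log probabilities, each optionally followed by a base-10 back-off weight.

```python
def parse_arpa(lines):
    """Parse ARPA-format lines into {ngram_tuple: (log10_prob, log10_backoff)}.

    Hypothetical helper for illustration only; not part of pynlpl.
    """
    table = {}
    in_ngram_section = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("\\"):
            # "\1-grams:" opens an n-gram section; "\data\" and "\end\" do not.
            in_ngram_section = line.endswith("-grams:")
            continue
        if not in_ngram_section:
            continue  # skips the "ngram N=count" lines of the \data\ header
        fields = line.split("\t")
        logprob = float(fields[0])
        ngram = tuple(fields[1].split(" "))
        # The back-off weight is optional; a missing weight means log10(1) = 0.
        backoff = float(fields[2]) if len(fields) > 2 else 0.0
        table[ngram] = (logprob, backoff)
    return table


# A tiny hand-written model (made-up numbers), one list item per file line.
SAMPLE = [
    "\\data\\",
    "ngram 1=3",
    "ngram 2=1",
    "",
    "\\1-grams:",
    "-0.5\tthe\t-0.3",
    "-0.7\tcat\t-0.2",
    "-1.0\t</s>",
    "",
    "\\2-grams:",
    "-0.4\tthe cat",
    "",
    "\\end\\",
]

table = parse_arpa(SAMPLE)
```

With a real file, the same function could be fed an open file object, since it only iterates over lines.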

class NgramsProbs(data, mode='simple', delim=' ')

Store n-grams with their probabilities and back-off weights.

This class abstracts the physical storage layout, which makes memory/speed trade-offs possible.

backoff(ngram)

Return the back-off value of the given n-gram tuple.

prob(ngram)

Return the probability of the given n-gram tuple.
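
How a probability and a back-off weight combine in a back-off model can be sketched as follows. This shows the standard back-off recursion, not pynlpl's own code: the function `backoff_logprob` and the plain dictionary used as storage are assumptions made for illustration, and the numbers are invented. Values are base-10 logs, so multiplying probabilities becomes adding logs.

```python
def backoff_logprob(table, ngram):
    """Return log10 P(last word | preceding words) using back-off.

    `table` maps n-gram tuples to (log10_prob, log10_backoff) pairs.
    Hypothetical helper for illustration only; not part of pynlpl.
    """
    if ngram in table:
        # The full n-gram was seen: use its stored probability directly.
        return table[ngram][0]
    if len(ngram) == 1:
        # This sketch does no unknown-word handling.
        raise KeyError("out-of-vocabulary word: %r" % (ngram[0],))
    history = ngram[:-1]
    # Back-off weight of the history; an unseen history contributes log10(1) = 0.
    weight = table.get(history, (0.0, 0.0))[1]
    # Drop the oldest history word and retry with the shorter n-gram.
    return weight + backoff_logprob(table, ngram[1:])


# Made-up model: three unigrams and one bigram.
toy = {
    ("the",): (-0.5, -0.3),
    ("cat",): (-0.7, -0.2),
    ("sat",): (-1.0, 0.0),
    ("the", "cat"): (-0.4, 0.0),
}

seen = backoff_logprob(toy, ("the", "cat"))    # stored bigram: -0.4
unseen = backoff_logprob(toy, ("the", "sat"))  # backoff("the") + prob("sat")
```

The unseen bigram falls back to the unigram probability of "sat", penalised by the back-off weight stored for "the".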

ARPALanguageModel.score(data, history=None)
ARPALanguageModel.scoreword(word, history=None)

class pynlpl.lm.lm.SimpleLanguageModel(n=2, casesensitive=True, beginmarker='<begin>', endmarker='<end>')

A simple unsmoothed language model. This class can both compute the model from data and hold the result.

append(sentence)
load(filename)
save(filename)
scoresentence(sentence)
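
To make "unsmoothed" concrete, here is a minimal sketch of a bigram model estimated from relative frequencies. It is not the SimpleLanguageModel implementation: the class `TinyBigramModel` and its `score` method are hypothetical, and only the idea of begin and end markers is taken from the signature above. Without smoothing, any sentence containing an unseen bigram receives probability zero.

```python
from collections import Counter


class TinyBigramModel:
    """Unsmoothed bigram model (illustration only; not part of pynlpl)."""

    def __init__(self, beginmarker="<begin>", endmarker="<end>"):
        self.beginmarker = beginmarker
        self.endmarker = endmarker
        self.bigrams = Counter()    # counts of (previous word, word)
        self.histories = Counter()  # counts of the previous word alone

    def _pairs(self, sentence):
        # Pad with the markers, then yield consecutive word pairs.
        tokens = [self.beginmarker] + list(sentence) + [self.endmarker]
        return zip(tokens, tokens[1:])

    def append(self, sentence):
        """Add the counts of one tokenised sentence to the model."""
        for prev, word in self._pairs(sentence):
            self.bigrams[(prev, word)] += 1
            self.histories[prev] += 1

    def score(self, sentence):
        """Return the sentence probability as a product of relative frequencies."""
        p = 1.0
        for prev, word in self._pairs(sentence):
            if self.histories[prev] == 0:
                return 0.0  # unseen history: no estimate without smoothing
            p *= self.bigrams[(prev, word)] / self.histories[prev]
        return p


model = TinyBigramModel()
model.append(["the", "cat"])
model.append(["the", "dog"])

p_seen = model.score(["the", "cat"])  # 1.0 * 0.5 * 1.0
p_unseen = model.score(["a", "cat"])  # contains unseen bigrams
```

Here "the" is followed by "cat" in one of its two occurrences, so the seen sentence scores 0.5, while the sentence starting with the unseen word "a" scores 0.0.
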
class pynlpl.lm.server.LMNGramFactory(lm)
protocol

alias of LMNGramProtocol

class pynlpl.lm.server.LMNGramProtocol
lineReceived(ngram)

class pynlpl.lm.server.LMSentenceFactory(lm)
protocol

alias of LMSentenceProtocol

class pynlpl.lm.server.LMSentenceProtocol
lineReceived(sentence)

class pynlpl.lm.server.LMServer(lm, port=12346, n=0)

Language Model Server

class pynlpl.lm.client.LMClient(host='localhost', port=12346, n=0)
scoresentence(sentence)