PocketSphinx  0.6
src/libpocketsphinx/acmod.h File Reference

Acoustic model structures for PocketSphinx.

#include <stdio.h>
#include <sphinxbase/cmd_ln.h>
#include <sphinxbase/logmath.h>
#include <sphinxbase/fe.h>
#include <sphinxbase/feat.h>
#include <sphinxbase/bitvec.h>
#include <sphinxbase/err.h>
#include "ps_mllr.h"
#include "bin_mdef.h"
#include "tmat.h"
#include "hmm.h"

Data Structures

struct  ps_mllr_s
 Feature space linear transform structure.
struct  ps_mgaufuncs_s
struct  ps_mgau_s
struct  acmod_s
 Acoustic model structure.

Defines

#define SENSCR_DUMMY   0x7fff
 Dummy senone score value for unintentionally active states.
#define ps_mgau_base(mg)   ((ps_mgau_t *)(mg))
#define ps_mgau_frame_eval(mg, senscr, senone_active, n_senone_active, feat, frame, compallsen)
#define ps_mgau_transform(mg, mllr)   (*ps_mgau_base(mg)->vt->transform)(mg, mllr)
#define ps_mgau_free(mg)   (*ps_mgau_base(mg)->vt->free)(mg)
#define acmod_activate_sen(acmod, sen)   bitvec_set((acmod)->senone_active_vec, sen)
 Activate a single senone.

Typedefs

typedef enum acmod_state_e acmod_state_t
 States in utterance processing.
typedef struct ps_mgau_s ps_mgau_t
 Acoustic model parameter structure.
typedef struct ps_mgaufuncs_s ps_mgaufuncs_t
typedef struct acmod_s acmod_t

Enumerations

enum  acmod_state_e { ACMOD_IDLE, ACMOD_STARTED, ACMOD_PROCESSING, ACMOD_ENDED }
 States in utterance processing.

Functions

acmod_t * acmod_init (cmd_ln_t *config, logmath_t *lmath, fe_t *fe, feat_t *fcb)
 Initialize an acoustic model.
ps_mllr_t * acmod_update_mllr (acmod_t *acmod, ps_mllr_t *mllr)
 Adapt acoustic model using a linear transform.
int acmod_set_senfh (acmod_t *acmod, FILE *senfh)
 Start logging senone scores to a filehandle.
int acmod_set_mfcfh (acmod_t *acmod, FILE *logfh)
 Start logging MFCCs to a filehandle.
int acmod_set_rawfh (acmod_t *acmod, FILE *logfh)
 Start logging raw audio to a filehandle.
void acmod_free (acmod_t *acmod)
 Finalize an acoustic model.
int acmod_start_utt (acmod_t *acmod)
 Mark the start of an utterance.
int acmod_end_utt (acmod_t *acmod)
 Mark the end of an utterance.
int acmod_rewind (acmod_t *acmod)
 Rewind the current utterance, allowing it to be rescored.
int acmod_advance (acmod_t *acmod)
 Advance the frame index.
int acmod_set_grow (acmod_t *acmod, int grow_feat)
 Set memory allocation policy for utterance processing.
int acmod_process_raw (acmod_t *acmod, int16 const **inout_raw, size_t *inout_n_samps, int full_utt)
 Feed raw audio data to the acoustic model for scoring.
int acmod_process_cep (acmod_t *acmod, mfcc_t ***inout_cep, int *inout_n_frames, int full_utt)
 Feed acoustic feature data into the acoustic model for scoring.
int acmod_process_feat (acmod_t *acmod, mfcc_t **feat)
 Feed dynamic feature data into the acoustic model for scoring.
int acmod_set_insenfh (acmod_t *acmod, FILE *insenfh)
 Set up a senone score dump file for input.
int acmod_read_scores (acmod_t *acmod)
 Read one frame of scores from senone score dump file.
mfcc_t ** acmod_get_frame (acmod_t *acmod, int *inout_frame_idx)
 Get a frame of dynamic feature data.
int16 const * acmod_score (acmod_t *acmod, int *inout_frame_idx)
 Score one frame of data.
int acmod_write_senfh_header (acmod_t *acmod, FILE *logfh)
 Write senone dump file header.
int acmod_write_scores (acmod_t *acmod, int n_active, uint8 const *active, int16 const *senscr, FILE *senfh)
 Write a frame of senone scores to a dump file.
int acmod_best_score (acmod_t *acmod, int *out_best_senid)
 Get best score and senone index for current frame.
void acmod_clear_active (acmod_t *acmod)
 Clear set of active senones.
void acmod_activate_hmm (acmod_t *acmod, hmm_t *hmm)
 Activate senones associated with an HMM.
int32 acmod_flags2list (acmod_t *acmod)
 Build active list from active senone flags.

Detailed Description

Acoustic model structures for PocketSphinx.

Author:
David Huggins-Daines <dhuggins@cs.cmu.edu>

Definition in file acmod.h.


Define Documentation

#define ps_mgau_frame_eval (   mg,
  senscr,
  senone_active,
  n_senone_active,
  feat,
  frame,
  compallsen 
)
Value:
(*ps_mgau_base(mg)->vt->frame_eval)                                 \
    (mg, senscr, senone_active, n_senone_active, feat, frame, compallsen)

Definition at line 118 of file acmod.h.
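
The macro above dispatches through a function table stored in the base structure. The following is a minimal sketch of that pattern only. Every `toy_` name is hypothetical and the argument list is shortened; it is not the real ps_mgau_t layout.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration; not the real PocketSphinx types. */
typedef struct toy_mgau_s toy_mgau_t;

typedef struct toy_mgaufuncs_s {
    const char *name;
    int (*frame_eval)(toy_mgau_t *mg, int frame);
} toy_mgaufuncs_t;

struct toy_mgau_s {
    toy_mgaufuncs_t *vt; /* function table */
    int frame_idx;
};

/* Same shape as ps_mgau_base() and ps_mgau_frame_eval(). */
#define toy_mgau_base(mg) ((toy_mgau_t *)(mg))
#define toy_mgau_frame_eval(mg, frame) \
    (*toy_mgau_base(mg)->vt->frame_eval)(toy_mgau_base(mg), frame)

/* A "derived" model: the base must be the first member so the cast is valid. */
typedef struct {
    toy_mgau_t base;
    int scale;
} toy_scaled_t;

static int scaled_eval(toy_mgau_t *mg, int frame)
{
    toy_scaled_t *s = (toy_scaled_t *)mg;

    assert(mg != NULL);
    mg->frame_idx = frame;
    return frame * s->scale;
}

static toy_mgaufuncs_t scaled_funcs = { "scaled", scaled_eval };

/* Call the implementation through the table without naming it directly. */
int toy_demo_dispatch(void)
{
    toy_scaled_t s = { { &scaled_funcs, 0 }, 3 };

    return toy_mgau_frame_eval(&s, 7);
}
```

The caller only ever sees the base type and the macro, so a different implementation can be swapped in by pointing `vt` at another table.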


Enumeration Type Documentation

States in utterance processing.

Enumerator:
ACMOD_IDLE 

Not in an utterance.

ACMOD_STARTED 

Utterance started, no data yet.

ACMOD_PROCESSING 

Utterance in progress.

ACMOD_ENDED 

Utterance ended, still buffering.

Definition at line 66 of file acmod.h.


Function Documentation

int acmod_advance ( acmod_t *  acmod )

Advance the frame index.

This function moves to the next frame of input data. Subsequent calls to acmod_score() will return scores for that frame, until the next call to acmod_advance().

Returns:
New frame index.

Definition at line 880 of file acmod.c.

References acmod_s::feat_outidx, ps_mgau_s::frame_idx, acmod_s::mgau, acmod_s::n_feat_alloc, acmod_s::n_feat_frame, and acmod_s::output_frame.
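
The score-then-advance contract described above can be sketched with a toy cursor. This is an illustrative model under stated assumptions: `toy_cursor_t`, `toy_score()` and `toy_advance()` are hypothetical stand-ins that mimic the documented behavior and do not call into PocketSphinx.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_N_FRAMES 3

/* Toy stand-in for the scoring cursor; not the real acmod_t. */
typedef struct {
    int scores[TOY_N_FRAMES]; /* one fake score per buffered frame */
    int output_frame;         /* index of the frame currently being scored */
} toy_cursor_t;

/* Return the score for the current frame, or NULL when none is left. */
static const int *toy_score(const toy_cursor_t *c)
{
    assert(c != NULL);
    if (c->output_frame >= TOY_N_FRAMES)
        return NULL;
    return &c->scores[c->output_frame];
}

/* Move to the next frame and return the new frame index. */
static int toy_advance(toy_cursor_t *c)
{
    assert(c != NULL);
    return ++c->output_frame;
}

/* Sum all scores using the score/advance loop. */
int toy_sum_scores(void)
{
    toy_cursor_t c = { { 10, 20, 30 }, 0 };
    const int *s;
    int total = 0;

    while ((s = toy_score(&c)) != NULL) {
        total += *s;     /* calling toy_score() again here would return the same frame */
        toy_advance(&c); /* only this call moves to the next frame */
    }
    return total;
}
```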

mfcc_t** acmod_get_frame ( acmod_t *  acmod,
int *  inout_frame_idx 
)

Get a frame of dynamic feature data.

Parameters:
inout_frame_idx  Input: frame index to get, or NULL to obtain features for the most recent frame. Output: frame index corresponding to this set of features.
Returns:
Feature array, or NULL if requested frame is not available.

Definition at line 1064 of file acmod.c.

References acmod_s::feat_buf.

acmod_t* acmod_init ( cmd_ln_t *  config,
logmath_t *  lmath,
fe_t *  fe,
feat_t *  fcb 
)

Initialize an acoustic model.

Parameters:
config  a command-line object containing parameters. This pointer is not retained by this object.
lmath  global log-math parameters.
fe  a previously-initialized acoustic feature module to use, or NULL to create one automatically. If this is supplied and its parameters do not match those in the acoustic model, this function will fail. This pointer is not retained.
fcb  a previously-initialized dynamic feature module to use, or NULL to create one automatically. If this is supplied and its parameters do not match those in the acoustic model, this function will fail. This pointer is not retained.
Returns:
a newly initialized acmod_t, or NULL on failure.

Definition at line 229 of file acmod.c.

References acmod_free(), ACMOD_IDLE, acmod_s::compallsen, acmod_s::config, acmod_s::fcb, acmod_s::fe, acmod_s::feat_buf, acmod_s::framepos, acmod_s::lmath, acmod_s::log_zero, acmod_s::mdef, acmod_s::mfc_buf, acmod_s::n_feat_alloc, acmod_s::n_mfc_alloc, acmod_s::senone_active, acmod_s::senone_active_vec, acmod_s::senone_scores, and acmod_s::state.

Referenced by ps_reinit().

int acmod_process_cep ( acmod_t *  acmod,
mfcc_t ***  inout_cep,
int *  inout_n_frames,
int  full_utt 
)

Feed acoustic feature data into the acoustic model for scoring.

Parameters:
inout_cep  In: pointer to buffer of features. Out: pointer to next frame to be read.
inout_n_frames  In: number of frames available. Out: number of frames remaining.
full_utt  If non-zero, this block represents a full utterance and should be processed as such.
Returns:
Number of frames of data processed.

Definition at line 689 of file acmod.c.

References ACMOD_ENDED, ACMOD_PROCESSING, ACMOD_STARTED, acmod_s::fcb, acmod_s::feat_buf, acmod_s::feat_outidx, acmod_s::grow_feat, acmod_s::mfcfh, acmod_s::n_feat_alloc, acmod_s::n_feat_frame, and acmod_s::state.

int acmod_process_feat ( acmod_t *  acmod,
mfcc_t **  feat 
)

Feed dynamic feature data into the acoustic model for scoring.

Unlike acmod_process_raw() and acmod_process_cep(), this function accepts a single frame at a time. This is because there is no need to do buffering when using dynamic features as input. However, if the dynamic feature buffer is full, this function will fail, so you should either always check the return value, or always pair a call to it with a call to acmod_score().

Parameters:
feat  Pointer to one frame of dynamic features.
Returns:
Number of frames processed (either 0 or 1).

Definition at line 781 of file acmod.c.

References acmod_s::fcb, acmod_s::feat_buf, acmod_s::feat_outidx, acmod_s::grow_feat, acmod_s::n_feat_alloc, and acmod_s::n_feat_frame.
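
The advice above, to check the return value or pair each feed with a scoring call, can be illustrated with a toy fixed-size buffer. This is a sketch only: the field names are borrowed from the References list, but their meaning here is an assumption, and all `toy_` functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_FEAT_ALLOC 2 /* toy buffer capacity, in frames */

/* Toy stand-in for the dynamic feature buffer; not the real acmod_t. */
typedef struct {
    int feat_buf[TOY_FEAT_ALLOC]; /* circular buffer of fake feature frames */
    int feat_outidx;              /* position of the next frame to score */
    int n_feat_frame;             /* number of frames waiting to be scored */
} toy_featbuf_t;

/* Queue one frame. Returns 1 if accepted, 0 if the buffer is full. */
int toy_process_feat(toy_featbuf_t *b, int feat)
{
    assert(b != NULL);
    if (b->n_feat_frame == TOY_FEAT_ALLOC)
        return 0;
    b->feat_buf[(b->feat_outidx + b->n_feat_frame) % TOY_FEAT_ALLOC] = feat;
    ++b->n_feat_frame;
    return 1;
}

/* Score and release the oldest frame. Returns -1 when nothing is queued
 * (fake features are assumed to be non-negative). */
int toy_score_and_advance(toy_featbuf_t *b)
{
    int feat;

    assert(b != NULL);
    if (b->n_feat_frame == 0)
        return -1;
    feat = b->feat_buf[b->feat_outidx];
    b->feat_outidx = (b->feat_outidx + 1) % TOY_FEAT_ALLOC;
    --b->n_feat_frame;
    return feat;
}
```

Feeding a third frame into this two-frame buffer returns 0 until one frame has been scored and released, which is why pairing each feed with a score keeps the buffer from filling.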

int acmod_process_raw ( acmod_t *  acmod,
int16 const **  inout_raw,
size_t *  inout_n_samps,
int  full_utt 
)

Feed raw audio data to the acoustic model for scoring.

The header comment also carries a TODO for setting the queue length for utterance processing, intended to let multiple concurrent passes of search operate on different parts of the utterance.

Parameters:
inout_raw  In: pointer to buffer of raw samples. Out: pointer to next sample to be read.
inout_n_samps  In: number of samples available. Out: number of samples remaining.
full_utt  If non-zero, this block represents a full utterance and should be processed as such.
Returns:
Number of frames of data processed.

Definition at line 620 of file acmod.c.

References acmod_s::fe, acmod_s::mfc_buf, acmod_s::mfc_outidx, acmod_s::n_mfc_alloc, acmod_s::n_mfc_frame, and acmod_s::rawfh.

Referenced by ps_process_raw().
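
The in/out convention used by inout_raw and inout_n_samps can be sketched as follows. This is a toy consumer under stated assumptions: `toy_process_raw()`, `toy_feed_all()`, the `max_frames` limit and the 4-sample frame size are hypothetical, and `int16_t` stands in for the sphinxbase `int16` type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FRAME_SHIFT 4 /* samples per frame; arbitrary toy value */

/* Consume up to max_frames whole frames from the caller's buffer.
 * On return, *inout_raw points at the next unread sample and
 * *inout_n_samps holds the number of samples left over. */
int toy_process_raw(const int16_t **inout_raw, size_t *inout_n_samps,
                    int max_frames)
{
    int n_frames = 0;

    assert(inout_raw != NULL && *inout_raw != NULL && inout_n_samps != NULL);
    while (n_frames < max_frames && *inout_n_samps >= TOY_FRAME_SHIFT) {
        *inout_raw += TOY_FRAME_SHIFT;
        *inout_n_samps -= TOY_FRAME_SHIFT;
        ++n_frames;
    }
    return n_frames;
}

/* Caller loop: keep feeding until no further frame is consumed. */
int toy_feed_all(const int16_t *buf, size_t n, size_t *out_left)
{
    const int16_t *p = buf;
    int total = 0;
    int got;

    while ((got = toy_process_raw(&p, &n, 2)) > 0)
        total += got;
    *out_left = n; /* samples that did not fill a whole frame */
    return total;
}
```

Because the callee updates both the pointer and the count, the caller can call it repeatedly on the same variables without doing any offset arithmetic itself.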

int acmod_read_scores ( acmod_t *  acmod )

Read one frame of scores from senone score dump file.

Returns:
Number of frames read or <0 on error.

Definition at line 991 of file acmod.c.

References acmod_s::feat_outidx, acmod_s::framepos, acmod_s::grow_feat, acmod_s::insenfh, acmod_s::n_feat_alloc, acmod_s::n_feat_frame, acmod_s::n_senone_active, acmod_s::output_frame, and acmod_s::senscr_frame.

Referenced by ps_decode_senscr().

int acmod_rewind ( acmod_t *  acmod )

Rewind the current utterance, allowing it to be rescored.

After calling this function, the internal frame index is reset, and acmod_score() will return scores starting at the first frame of the current utterance. Currently, acmod_set_grow() must have been called to enable growing the feature buffer in order for this to work. In the future, senone scores may be cached instead.

Returns:
0 for success, <0 for failure (if the utterance can't be rewound due to no feature or score data available)

Definition at line 858 of file acmod.c.

References acmod_s::feat_outidx, ps_mgau_s::frame_idx, acmod_s::mgau, acmod_s::n_feat_alloc, acmod_s::n_feat_frame, acmod_s::output_frame, and acmod_s::senscr_frame.
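
Why rewinding depends on the growing feature buffer can be sketched with a toy model. The assumption here, not confirmed by this page, is that a non-growing buffer discards its oldest frames once it is full; `toy_utt_t`, `toy_feed_frame()` and `toy_rewind()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CAPACITY 2 /* frames retained when the buffer does not grow */

/* Toy stand-in for per-utterance bookkeeping; not the real acmod_t. */
typedef struct {
    int grow_feat;    /* non-zero: keep every frame of the utterance */
    int first_kept;   /* index of the oldest frame still buffered */
    int n_frames;     /* frames fed so far in this utterance */
    int output_frame; /* next frame to score */
} toy_utt_t;

/* Feed one frame and pretend it is scored immediately. */
void toy_feed_frame(toy_utt_t *u)
{
    assert(u != NULL);
    ++u->n_frames;
    /* Without growth, old frames fall out of a fixed-size window. */
    if (!u->grow_feat && u->n_frames - u->first_kept > TOY_CAPACITY)
        u->first_kept = u->n_frames - TOY_CAPACITY;
    u->output_frame = u->n_frames;
}

/* Return 0 and reset the frame index if frame 0 is still buffered,
 * otherwise return -1. */
int toy_rewind(toy_utt_t *u)
{
    assert(u != NULL);
    if (u->first_kept > 0)
        return -1; /* the start of the utterance has been discarded */
    u->output_frame = 0;
    return 0;
}
```

In this model, an utterance longer than the fixed window can only be rewound when `grow_feat` was set before the data was fed.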

int16 const* acmod_score ( acmod_t *  acmod,
int *  inout_frame_idx 
)

Score one frame of data.

Parameters:
inout_frame_idx  Input: frame index to score, or NULL to obtain scores for the most recent frame. Output: frame index corresponding to this set of scores.
Returns:
Array of senone scores for this frame, or NULL if no frame is available for scoring (such as if a frame index is requested that is not yet or no longer available). The data pointed to persists only until the next call to acmod_score() or acmod_advance().

Definition at line 1082 of file acmod.c.

References acmod_flags2list(), acmod_write_scores(), acmod_s::compallsen, acmod_s::feat_buf, acmod_s::framepos, acmod_s::insenfh, acmod_s::mgau, acmod_s::n_senone_active, acmod_s::senfh, acmod_s::senone_active, acmod_s::senone_scores, and acmod_s::senscr_frame.

Referenced by ngram_fwdflat_search(), and ngram_fwdtree_search().

int acmod_set_grow ( acmod_t *  acmod,
int  grow_feat 
)

Set memory allocation policy for utterance processing.

Parameters:
grow_feat  If non-zero, the internal dynamic feature buffer will expand as necessary to encompass any amount of data fed to the model.
Returns:
previous allocation policy.

Definition at line 421 of file acmod.c.

References acmod_s::grow_feat, and acmod_s::n_feat_alloc.

Referenced by ps_process_raw(), and ps_reinit().

int acmod_set_insenfh ( acmod_t *  acmod,
FILE *  insenfh 
)

Set up a senone score dump file for input.

Parameters:
insenfh  File handle of dump file.
Returns:
0 for success, <0 for failure

Definition at line 845 of file acmod.c.

References acmod_s::compallsen, acmod_s::config, acmod_s::insenfh, and acmod_s::n_feat_frame.

Referenced by ps_decode_senscr().

int acmod_set_mfcfh ( acmod_t *  acmod,
FILE *  logfh 
)

Start logging MFCCs to a filehandle.

Parameters:
acmod  Acoustic model object.
logfh  Filehandle to log to.
Returns:
0 for success, <0 on error.

Definition at line 381 of file acmod.c.

References acmod_s::mfcfh.

Referenced by ps_start_utt().

int acmod_set_rawfh ( acmod_t *  acmod,
FILE *  logfh 
)

Start logging raw audio to a filehandle.

Parameters:
acmod  Acoustic model object.
logfh  Filehandle to log to.
Returns:
0 for success, <0 on error.

Definition at line 393 of file acmod.c.

References acmod_s::rawfh.

Referenced by ps_start_utt().

int acmod_set_senfh ( acmod_t *  acmod,
FILE *  senfh 
)

Start logging senone scores to a filehandle.

Parameters:
acmod  Acoustic model object.
senfh  Filehandle to log to.
Returns:
0 for success, <0 on error.

Definition at line 370 of file acmod.c.

References acmod_write_senfh_header(), and acmod_s::senfh.

Referenced by ps_start_utt().

ps_mllr_t* acmod_update_mllr ( acmod_t *  acmod,
ps_mllr_t *  mllr 
)

Adapt acoustic model using a linear transform.

Parameters:
mllr  The new transform to use, or NULL to update the existing transform. The decoder retains ownership of this pointer, so you should not attempt to free it manually. Use ps_mllr_retain() if you wish to reuse it elsewhere.
Returns:
The updated transform object for this decoder, or NULL on failure.

Definition at line 345 of file acmod.c.

References acmod_s::mgau, acmod_s::mllr, and ps_mllr_free().

Referenced by ps_update_mllr().