lextree.h
Go to the documentation of this file.
1 /* -*- c-basic-offset: 4; indent-tabs-mode: nil -*- */
2 /* ====================================================================
3  * Copyright (c) 1999-2004 Carnegie Mellon University. All rights
4  * reserved.
5  *
6  * Redistribution and use in source and binary forms, with or without
7  * modification, are permitted provided that the following conditions
8  * are met:
9  *
10  * 1. Redistributions of source code must retain the above copyright
11  * notice, this list of conditions and the following disclaimer.
12  *
13  * 2. Redistributions in binary form must reproduce the above copyright
14  * notice, this list of conditions and the following disclaimer in
15  * the documentation and/or other materials provided with the
16  * distribution.
17  *
18  * This work was supported in part by funding from the Defense Advanced
19  * Research Projects Agency and the National Science Foundation of the
20  * United States of America, and the CMU Sphinx Speech Consortium.
21  *
22  * THIS SOFTWARE IS PROVIDED BY CARNEGIE MELLON UNIVERSITY ``AS IS'' AND
23  * ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
24  * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
25  * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL CARNEGIE MELLON UNIVERSITY
26  * NOR ITS EMPLOYEES BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
27  * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
28  * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
29  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
30  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
31  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
32  * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
33  *
34  * ====================================================================
35  *
36  */
37 /*
38  * lextree.h --
39  *
40  * **********************************************
41  * CMU ARPA Speech Project
42  *
43  * Copyright (c) 1999 Carnegie Mellon University.
44  * ALL RIGHTS RESERVED.
45  * **********************************************
46  *
47  * HISTORY
48  * $Log$
49  * Revision 1.1 2006/04/05 20:27:30 dhdfu
50  * A Great Reorganzation of header files and executables
51  *
52  * Revision 1.11 2006/02/23 15:08:24 arthchan2003
53  * Merged from the branch SPHINX3_5_2_RCI_IRII_BRANCH:
54  *
55  * 1, Fixed memory leaks.
56  * 2, Add logic for full triphone expansion. At this point, the
57  * propagation of scores in word end is still incorrect. So composite
58  * triphone should still be used by default.
59  * 3, Removed lextree_copies_hmm_propagate.
60  *
61  * Revision 1.10.4.6 2005/11/17 06:28:50 arthchan2003
62  * Changed the code to used compressed triphones. Not yet correct at this point
63  *
64  * Revision 1.10.4.5 2005/10/17 04:53:44 arthchan2003
65  * Shrub the trees so that the run-time memory could be controlled.
66  *
67  * Revision 1.10.4.4 2005/10/07 19:34:31 arthchan2003
68  * In full cross-word triphones expansion, the previous implementation has several flaws, e.g, 1, it didn't consider the phone beam on cross word triphones. 2, Also, when the cross word triphone phone is used, children of the last phones will be regarded as cross word triphone. So, the last phone should not be evaluated at all. Last implementation has not safe-guaded that. 3, The rescoring for language model is not done correctly. What we still need to do: a, test the algorithm in more databases. b, implement some speed up schemes.
69  *
70  * Revision 1.10.4.2 2005/09/25 19:23:55 arthchan2003
71  * 1, Added arguments for turning on/off LTS rules. 2, Added arguments for turning on/off composite triphones. 3, Moved dict2pid deallocation back to dict2pid. 4, Tidying up the clean up code.
72  *
73  * Revision 1.10.4.1 2005/06/27 05:37:05 arthchan2003
74  * Incorporated several fixes to the search. 1, If a tree is empty, it will be removed and put back to the pool of tree, so number of trees will not be always increasing. 2, In the previous search, the answer is always "STOP P I T G S B U R G H </s>"and filler words never occurred in the search. The reason is very simple, fillers was not properly propagated in the search at all <**exculamation**> This version fixed this problem. The current search will give <sil> P I T T S B U R G H </sil> </s> to me. This I think it looks much better now.
75  *
76  * Revision 1.10 2005/06/21 23:32:58 arthchan2003
77  * Log. Introduced lextree_init and filler_init to wrap up lextree_build
78  * process. Split the hmm propagation to propagation for leaves and
79  * non-leaves node. This allows an easier time for turning off the
80  * rescoring stage. However, the implementation is not clever enough. One
81  * should split the array to leave array and non-leave array.
82  *
83  * Revision 1.7 2005/06/16 04:59:10 archan
84  * Sphinx3 to s3.generic, a gentle-refactored version of Dave's change in senone scale.
85  *
86  * Revision 1.6 2005/06/13 04:02:59 archan
87  * Fixed most doxygen-style documentation under libs3decoder.
88  *
89  * Revision 1.5 2005/04/25 23:53:35 archan
90  * 1, Some minor modification of vithist_t, vithist_rescore can now support optional LM rescoring, vithist also has its own reporting routine. A new argument -lmrescore is also added in decode and livepretend. This can switch on and off the rescoring procedure. 2, I am reaching the final difficulty of mode 5 implementation. That is, to implement an algorithm which dynamically decide which tree copies should be entered. However, stuffs like score propagation in the leave nodes and non-leaves nodes are already done. 3, As briefly mentioned in 2, implementation of rescoring , which used to happened at leave nodes are now separated. The current implementation is not the most clever one. Wish I have time to change it before check-in to the canonical.
91  *
92  * Revision 1.4 2005/04/25 19:22:47 archan
93  * Refactor out the code of rescoring from lexical tree. Potentially we want to turn off the rescoring if we need.
94  *
95  * Revision 1.3 2005/03/30 01:22:47 archan
96  * Fixed mistakes in last updates. Add
97  *
98  *
99  * 29-Feb-2000 M K Ravishankar (rkm@cs.cmu.edu) at Carnegie Mellon University
100  * Modified some functions to be able to deal with HMMs with any number
101  * of states.
102  *
103  * 07-Jul-1999 M K Ravishankar (rkm@cs.cmu.edu) at Carnegie Mellon University
104  * Added lextree_node_t.ci and lextree_ci_active().
105  *
106  * 30-Apr-1999 M K Ravishankar (rkm@cs.cmu.edu) at Carnegie Mellon University
107  * Started.
108  */
109 
110 
111 #ifndef _S3_LEXTREE_H_
112 #define _S3_LEXTREE_H_
113 
114 #include <stdio.h>
115 
116 #include <bitvec.h>
117 #include <s3types.h>
118 #include <glist.h>
119 #include "kbcore.h"
120 #include "hmm.h"
121 #include "lm.h"
122 #include "vithist.h"
123 #include "ascr.h"
124 #include "fast_algo_struct.h"
125 #include "dict.h"
126 #include "mdef.h"
127 
128 #ifdef __cplusplus
129 extern "C" {
130 #endif
131 #if 0
132 } /* Fool Emacs into not indenting things. */
133 #endif
134 
135 #define LEXTREE_OPERATION_SUCCESS 1
136 #define LEXTREE_OPERATION_FAILURE 0
137 
138 #define LEXTREE_TYPE_FILLER -1
139 #define LEXTREE_TYPE_UNIGRAM 0
140 #define LEXTREE_TYPE_BIGRAM 1
141 #define LEXTREE_TYPE_TRIGRAM 2
142 
143 #if 0 /*Number reserved but not used */
144 #define LEXTREE_TYPE_QUADGRAM 3
145 #define LEXTREE_TYPE_QUINGRAM 4
146 #endif
147 
148 
179 /*
180  * \struct lextree_node_t
181  * One node in a lextree.
182  */
183 typedef struct {
187  glist_t children;
201  int32 wid;
202  int32 prob;
203  int32 ssid;
207  int8 composite;
209 
210 /* Access macros; not meant for arbitrary use */
211 #define lextree_node_wid(n) ((n)->wid)
212 #define lextree_node_prob(n) ((n)->prob)
213 #define lextree_node_ssid(n) ((n)->ssid)
214 #define lextree_node_rc(n) ((n)->rc)
215 #define lextree_node_composite(n) ((n)->composite)
216 #define lextree_node_frame(n) ((n)->frame)
217 
218 
219 /*
220  * \struct lextree_lcroot_t
221  * Root nodes of the lextree valid for a given left context CIphone.
222  */
223 typedef struct {
224  s3cipid_t lc; /* Left context CIphone */
225  glist_t root; /* Its data.ptr are the root nodes (lextree_node_t *) of interest; subset
226  of the entire lextree roots */
228 
229 /*
230  * \struct lextree_t
231  * Entire lextree.
232  */
233 typedef struct {
234  int32 type;
236  glist_t root;
239  int32 n_lc;
240  int32 n_node;
241  int32 n_alloc_node;
250  int32 n_active;
253  int32 best;
254  int32 wbest;
256  char prev_word[100];
257 } lextree_t;
258 
259 /* Access macros; not meant for arbitrary usage */
260 #define lextree_type(l) ((l)->type)
261 #define lextree_root(l) ((l)->root)
262 #define lextree_lcroot(l) ((l)->lcroot)
263 #define lextree_n_lc(l) ((l)->n_lc)
264 #define lextree_n_node(l) ((l)->n_node)
265 #define lextree_n_alloc_node(l) ((l)->n_alloc_node)
266 #define lextree_active(l) ((l)->active)
267 #define lextree_next_active(l) ((l)->next_active)
268 #define lextree_n_active(l) ((l)->n_active)
269 #define lextree_n_next_active(l) ((l)->n_next_active)
270 
271 
276  kbcore_t *kbcore,
277  lm_t* lm,
278  char *lmname,
279  int32 istreeUgProb,
280  int32 bReport,
281  int32 type
282  );
283 
287  kbcore_t *kbcore
288  );
289 
290 
293 void lextree_report(
294  lextree_t *ltree
295  );
296 
303 lextree_t *
304 lextree_build (kbcore_t *kbc,
305  wordprob_t *wordprob,
306  int32 n_word,
307  s3cipid_t *lc,
309  int32 type
310  );
311 
312 /* Free a lextree that was created by lextree_build */
313 void lextree_free (lextree_t *lextree);
314 
315 
320 void lextree_utt_end (lextree_t *l, kbcore_t *kbc);
321 
322 
326 void lextree_enter (lextree_t *lextree,
327  s3cipid_t lc,
328  int32 frame,
329  int32 inscore,
330  int32 inhist,
331  int32 thresh,
333  kbcore_t *kbc
334  );
335 
341 void lextree_active_swap (lextree_t *lextree
342  );
343 
344 
349 void lextree_ssid_active (lextree_t *lextree,
350  uint8 *ssid,
352  uint8 *comssid
354  );
355 
359 void lextree_ci_active (lextree_t *lextree,
360  bitvec_t *ci_active
361  );
362 
368 int32 lextree_hmm_eval (lextree_t *lextree,
369  kbcore_t *kbc,
370  ascr_t *ascr,
371  int32 f,
372  FILE *fp
373  );
374 
375 /*
376  * ARCHAN: Starting from Sphinx 3.6, lextree_hmm_propagate is separated to
377  * lextree_hmm_propagate_non_leaves and lextree_hmm_propagate_leaves. This allows
378  * the rescoring routine to be more easily switched off
379  */
380 
407  kbcore_t *kbc,
408  int32 cf,
409  int32 th,
410  int32 pth,
411  int32 wth,
412  pl_t* pl
413  );
414 
415 
426 int32 lextree_hmm_propagate_leaves (lextree_t *lextree,
428  kbcore_t *kbc,
429  vithist_t *vh,
431  int32 cf,
432  int32 wth
433  );
434 
435 
442 void lextree_hmm_histbin (lextree_t *lextree,
443  int32 bestscr,
444  int32 *bin,
446  int32 nbin,
447  int32 bw
448  );
449 
451 void lextree_dump (lextree_t *lextree,
452  dict_t *dict,
453  mdef_t *mdef,
454  FILE *fp,
455  int32 fmt
456  );
457 
459 int32 num_lextree_links(lextree_t *ltree
460  );
461 #if 0
462 { /* Stop indent from complaining */
463 #endif
464 #ifdef __cplusplus
465 }
466 #endif
467 
468 #endif
hmm_context_t * comctx
Definition: lextree.h:245
strcture for storing the model definition.
Definition: mdef.h:184
File that implement various structure for fast algorithms. fast_algo_struct implement beam_t...
Definition: lextree.h:233
Definition: vithist.h:238
Generic structure that could be used at any n-gram level.
Definition: lm.h:783
int32 num_lextree_links(lextree_t *ltree)
Structure that contains all parameters for phoneme lookahead.
Definition: fast_algo_struct.h:173
int32 n_active
Definition: lextree.h:250
lextree_t * fillertree_init(kbcore_t *kbcore)
int32 type
Definition: lextree.h:234
int32 n_next_active
Definition: lextree.h:251
lextree_node_t ** active
Definition: lextree.h:247
An individual HMM among the HMM search space.
void lextree_enter(lextree_t *lextree, s3cipid_t lc, int32 frame, int32 inscore, int32 inhist, int32 thresh, kbcore_t *kbc)
kb core structures, the structure that stores parameters for s3.X search
void lextree_hmm_histbin(lextree_t *lextree, int32 bestscr, int32 *bin, int32 nbin, int32 bw)
Operations on dictionary.
lextree_node_t ** next_active
Definition: lextree.h:248
int32 prob
Definition: lextree.h:202
lextree_lcroot_t * lcroot
Definition: lextree.h:237
int32 wid
Definition: lextree.h:201
HMM data structure and operation.
glist_t root
Definition: lextree.h:225
int32 lextree_hmm_propagate_non_leaves(lextree_t *lextree, kbcore_t *kbc, int32 cf, int32 th, int32 pth, int32 wth, pl_t *pl)
int8 composite
Definition: lextree.h:207
Definition: ascr.h:99
Definition: lextree.h:183
int32 n_alloc_node
Definition: lextree.h:241
lextree_t * lextree_init(kbcore_t *kbcore, lm_t *lm, char *lmname, int32 istreeUgProb, int32 bReport, int32 type)
void lextree_utt_end(lextree_t *l, kbcore_t *kbc)
int16 s3cipid_t
Definition: s3types.h:110
void lextree_ci_active(lextree_t *lextree, bitvec_t *ci_active)
Definition: lextree.h:223
Viterbi history structures. Mainly vithist_t, also its slightly older brother latticehist_t. They are respectively used by decode (mode 4 and 5) and decode_anytopo (mode 3). The curent arrangement is temporary.
Size definition of semantically units. Common for both s3 and s3.X decoder.
Shared information between a set of HMMs.
int32 n_alloc_blk_sz
Definition: lextree.h:242
void lextree_report(lextree_t *ltree)
a structure for a dictionary.
Definition: dict.h:146
void lextree_ssid_active(lextree_t *lextree, uint8 *ssid, uint8 *comssid)
glist_t root
Definition: lextree.h:236
int32 n_lc
Definition: lextree.h:239
glist_t children
Definition: lextree.h:187
void lextree_dump(lextree_t *lextree, dict_t *dict, mdef_t *mdef, FILE *fp, int32 fmt)
int32 ssid
Definition: lextree.h:203
int32 wbest
Definition: lextree.h:254
Model definition.
hmm_context_t * ctx
Definition: lextree.h:244
s3cipid_t ci
Definition: lextree.h:206
The language model. All unigrams are read into memory on initialization. Bigrams and trigrams read in...
Wrapper to hold senone scores.
s3cipid_t lc
Definition: lextree.h:224
int32 lextree_hmm_eval(lextree_t *lextree, kbcore_t *kbc, ascr_t *ascr, int32 f, FILE *fp)
int32 n_node
Definition: lextree.h:240
hmm_context_t * ctx
Definition: lextree.h:185
void lextree_active_swap(lextree_t *lextree)
Definition: kbcore.h:134
lextree_t * lextree_build(kbcore_t *kbc, wordprob_t *wordprob, int32 n_word, s3cipid_t *lc, int32 type)
Language model.
void lextree_free(lextree_t *lextree)
int32 best
Definition: lextree.h:253
s3cipid_t rc
Definition: lextree.h:204
int32 lextree_hmm_propagate_leaves(lextree_t *lextree, kbcore_t *kbc, vithist_t *vh, int32 cf, int32 wth)
hmm_t hmm
Definition: lextree.h:184