File
A source file may be either fixed width or delimited; which to use is selected from the combobox at the top.
A delimited file has a delimiter and, optionally, a qualifier: the delimiter separates input fields, and the qualifier surrounds each field. For example, the line one|two|three uses | as the delimiter and has no qualifier, while "one"|"two"|"three" uses the same delimiter with " as the qualifier.
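As an illustration of the delimiter and qualifier roles, here is a minimal Python sketch using the standard csv module. It is not the copier's own code; the function name split_delimited and its parameters are hypothetical.

```python
import csv
import io


def split_delimited(line, delimiter="|", qualifier=None):
    """Split one line into fields (hypothetical helper, not part of Rekall)."""
    if qualifier is None:
        # No qualifier: only the delimiter has special meaning.
        reader = csv.reader(io.StringIO(line), delimiter=delimiter,
                            quoting=csv.QUOTE_NONE)
    else:
        # With a qualifier, a delimiter inside a qualified field is plain data.
        reader = csv.reader(io.StringIO(line), delimiter=delimiter,
                            quotechar=qualifier)
    return next(reader, [])


plain = split_delimited("one|two|three")
qualified = split_delimited('"one"|"two"|"three"', qualifier='"')
```

Both calls give the same three fields; the qualifier only matters when a field itself contains the delimiter.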
You can optionally specify a copy list, which names the columns to be copied. By default, all columns are copied, but you can give a list such as 1,5,6,7 (four columns: 1, 5, 6 and 7) or 1-4,7-9 (seven columns: 1, 2, 3, 4, 7, 8 and 9). A column can appear more than once, for instance 1-3,1.
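The copy list syntax can be expanded mechanically. The sketch below is a hypothetical helper (parse_copy_list is not a Rekall function) that turns a copy list into the sequence of column numbers it names, keeping repeats and order.

```python
def parse_copy_list(spec):
    """Expand a copy list such as '1-4,7-9' into a list of column numbers."""
    columns = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range such as 1-4 includes both end points.
            first, last = (int(p) for p in part.split("-", 1))
            columns.extend(range(first, last + 1))
        else:
            columns.append(int(part))
    return columns


simple = parse_copy_list("1,5,6,7")
ranges = parse_copy_list("1-4,7-9")
repeated = parse_copy_list("1-3,1")
```

The sketch only expands the numbers; it makes no assumption about how the copier maps them onto the fields of a line.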
In a fixed width file, fields occupy specified columns (e.g., field1 occupies columns 0 through 12, field2 occupies 13 through 20, and so on). Fields are specified in the appropriate area of the dialog; each can be given a name, which is used only as a comment. Fields do not have to occupy contiguous ranges of columns, and need not cover all the columns in the file (indeed, fields may overlap, though you will be warned about this). The Set from table button can be used to choose a server database and table on which to base the set of fields.
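To make the column ranges concrete, here is a small Python sketch that slices a line by inclusive, zero-based column ranges, as in the field1/field2 example, and reports overlapping fields. The helper names and the (name, first, last) tuple layout are assumptions made for illustration, not the copier's internals.

```python
def split_fixed_width(line, fields):
    """fields: list of (name, first_col, last_col), inclusive and zero-based."""
    return {name: line[first:last + 1] for name, first, last in fields}


def overlapping_fields(fields):
    """Return pairs of field names whose column ranges overlap."""
    ordered = sorted(fields, key=lambda f: f[1])
    clashes = []
    for i, (name_a, _first_a, last_a) in enumerate(ordered):
        for name_b, first_b, _last_b in ordered[i + 1:]:
            if first_b <= last_a:
                clashes.append((name_a, name_b))
    return clashes


layout = [("field1", 0, 12), ("field2", 13, 20)]
record = split_fixed_width("ABCDEFGHIJKLM12345678ignored", layout)
```

Columns after 20 are simply not read, since the fields need not cover the whole line.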
The copier can optionally treat the first line as a header line, in which case the first line is read and parsed to give a set of column names. If a fixed width file has a header, then the column names in the header override the names set in the list of fields.
The other setting (at the bottom, next to the file) controls error behaviour. Ignore excess means that extraneous fields are ignored (this only applies to delimited files); Skip means that any line with too few or too many fields is skipped; and Abort causes the copy to be aborted if there are too many or too few fields.
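The three error settings can be summarised as a small decision function. This is a sketch under stated assumptions: the mode strings are hypothetical, the excess fields are assumed to be the trailing ones, and a line with too few fields under Ignore excess is assumed to be an error, which the text above does not specify.

```python
def apply_error_mode(fields, expected, mode):
    """Return the fields to copy, None to skip the line, or raise to abort."""
    if mode not in ("ignore_excess", "skip", "abort"):
        raise ValueError("unknown mode: %r" % (mode,))
    if len(fields) == expected:
        return fields
    if mode == "ignore_excess" and len(fields) > expected:
        # Assumption: the extraneous fields are the trailing ones.
        return fields[:expected]
    if mode == "skip":
        return None
    # "abort", and by assumption "ignore_excess" with too few fields.
    raise ValueError("expected %d fields, got %d" % (expected, len(fields)))


trimmed = apply_error_mode(["a", "b", "c", "d"], 3, "ignore_excess")
skipped = apply_error_mode(["a", "b"], 3, "skip")
```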
Table
A table source specifies a server database, a table in that database, and one or more fields from the table; additionally, arbitrary SQL expressions can be added.
Optionally, SQL where and order by expressions can be specified, to select only certain rows, and to order the rows.
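One way to picture how these settings combine is as the parts of a single select statement. The sketch below is illustrative only: build_select is a hypothetical helper, and the exact SQL that the copier generates is not documented here.

```python
def build_select(table, fields, where=None, order_by=None):
    """Assemble a select statement from table-source style settings."""
    sql = "select " + ", ".join(fields) + " from " + table
    if where:
        sql += " where " + where
    if order_by:
        sql += " order by " + order_by
    return sql


query = build_select(
    "username",
    ["us_usernameid", "upper(us_username)"],  # a field and an SQL expression
    where="us_usernameid > 10",
    order_by="us_username",
)
```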
XML
The source copier can parse a limited range of XML format files. Specifically, the file must have a main document tag; immediately inside this are row elements. Data values are either attributes on the row elements, or elements within the row element. An example is shown below.
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE username>
    <username>
      <row>
        <us_usernameid>42</us_usernameid>
        <us_username>MIKE RICHARDSON</us_username>
      </row>
      <row>
        <us_usernameid>66</us_usernameid>
        <us_username>JOH DEAN</us_username>
      </row>
    </username>
The main document tag and the row tag are set in the copier. In addition, you should specify each data value to be extracted from each row of data in the XML file. As each XML row is processed, attribute names and data element tags are matched against the entries in the field list. Hence, there is no ordering requirement on attributes or data values in the XML; unrecognised attributes or values are silently ignored. If an entry is not matched then it will be set to the null value. Note that there must be no XML elements nested within the data values.
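The matching rules described above can be sketched in Python with the standard xml.etree.ElementTree module. This is an illustration rather than the copier's implementation; read_rows and its parameters are hypothetical, and letting an element override an attribute of the same name is an assumption.

```python
import xml.etree.ElementTree as ET


def read_rows(xml_text, main_tag, row_tag, field_names):
    """Extract one dict per row element; unmatched fields stay None (null)."""
    root = ET.fromstring(xml_text)
    if root.tag != main_tag:
        raise ValueError("unexpected main document tag: " + root.tag)
    rows = []
    for row in root.findall(row_tag):
        values = dict.fromkeys(field_names)  # every field starts as null
        for name, value in row.attrib.items():
            if name in values:  # unrecognised attributes are ignored
                values[name] = value
        for child in row:
            if child.tag in values:  # unrecognised elements are ignored
                values[child.tag] = child.text
        rows.append(values)
    return rows


sample = (
    "<username>"
    "<row us_usernameid='42'>"
    "<us_username>MIKE RICHARDSON</us_username><extra>x</extra>"
    "</row>"
    "<row><us_usernameid>66</us_usernameid></row>"
    "</username>"
)
rows = read_rows(sample, "username", "row", ["us_usernameid", "us_username"])
```

In the second row, us_username has no matching attribute or element, so it is left as None.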
Note that XML parsing is not very efficient, so using an XML source copier is not recommended for very large files.
Query
This is like a table source, except that the data is taken from a Rekall query.
Arbitrary SQL
This allows an arbitrary SQL select query to be given; the only other setting is the server database.