Mergemill Tags Guide: Content Insertion Field Tags



Overview

Mergemill merges static contents in a template with data feeds to generate your desired output. You embed special tags in the template to direct the insertion of contents from feeds you specify.

Content-insertion fields are embedded in the template in the form of the basic tag <?[FieldName]?>. The fields placed between a corresponding pair of <?Loop?> and <?EndLoop?> tags are in-loop fields, and those outside all loop tag pairs are out-loop fields. Field tags may be used to directly insert data values from feeds, or to insert values generated on-the-fly.
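As a rough illustration of the in-loop/out-loop distinction, the Python sketch below scans a template and groups field names by whether they sit between a <?Loop?> and <?EndLoop?> pair. It is not how Mergemill parses templates: the regular expression and the function name classify_fields are assumptions made for this example, and the OutLoop attribute is ignored.

```python
import re

# Matches <?Loop?>, <?EndLoop?>, or a field tag such as <?[Name]?> or <?[Name:UCase]{2}?>.
TAG = re.compile(r"<\?(?:(loop)|(endloop)|\[([^\]:]+)[^\]]*\][^?]*)\?>", re.IGNORECASE)

def classify_fields(template):
    """Return (in_loop, out_loop) sets of field names found in the template."""
    in_loop, out_loop = set(), set()
    depth = 0  # number of loop tag pairs currently open
    for match in TAG.finditer(template):
        if match.group(1):
            depth += 1
        elif match.group(2):
            depth = max(0, depth - 1)
        else:
            (in_loop if depth else out_loop).add(match.group(3).strip())
    return in_loop, out_loop
```

For a template with a title field above a loop that lists items and prices, the function would report the title as out-loop and the item and price fields as in-loop.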

When you set up a merge job in Mergemill, you specify an associated template and push a button to parse it. For each new field name found in the template, Mergemill automatically adds a task to the job definition. In the task settings associated with each field, you specify the data column to be fetched, or how the values are to be generated dynamically.

When you run a job, its associated template is executed repeatedly till all primary feeds are exhausted. In each run of the template, Mergemill copies all its static contents to the output, and replaces each field tag with the appropriate data value from its specified source to generate the full output. Depending on what you set the job to do, when Mergemill reaches the end of the template each time, one of several things will happen: the merged text is converted into speech, the merged email is sent, the merged file is saved, the data are used to update an SQL data store, or the data are exported to TSV, CSV or XML files.



Merge Data

A data source provides data feeds for fields in the template. The number of fields specified to fetch data from the same feed determines the number of data columns it contains: one column of data in the feed for each field. Other data columns in the data source are ignored.

A data feed may consist of many consecutive data streams, which may come from the files in a folder, the emails in an inbox, or the web pages on a URL list. For example, a folder of ten files provides a feed comprising ten streams. If you set a task to obtain data from a folder, an email account, or a URL list as a single stream, then the entire feed will be treated as one stream. If the data source is an SQL data store, there is always only one stream in the data feed.

All data streams in a feed contain the same number of data columns, each of which provides a series of data values for one field. Shorter columns in a data stream are always padded with empty values to match the data count of the longest column, so that a data stream always provides the same number of data values in all its columns.
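The padding rule can be pictured with a small Python sketch. The function name pad_stream and the dictionary layout are assumptions for illustration only; Mergemill's internal representation is not described in this guide.

```python
def pad_stream(columns):
    """Pad every column with empty strings to the length of the longest column."""
    longest = max((len(values) for values in columns.values()), default=0)
    return {
        name: values + [""] * (longest - len(values))
        for name, values in columns.items()
    }
```

A stream whose Name column has three values and whose Phone column has one would come out with two empty values appended to Phone.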

Streams are important in two ways. First, the scope of data sorting is limited to the data stream. You may set multi-level sorting for any data feed. All fields obtaining data from the same feed are sorted together by rows, one stream at a time. Second, loops always break on stream breaks. This is because Mergemill fetches data from feeds one stream at a time, to produce a set of outputs that together exhausts the current data streams for all the in-loop fields in the template. The section on loops explains this process in detail.

When Mergemill parses a template, it creates only one task for each unique field name, and so fields identically named in different loops all use the same column of data values from the same stream. Their uses of the same data column need not be in step, however, because Mergemill manages each loop separately when running the template.

Different fields requesting data from the same column of the same source do not necessarily obtain the same series of data values, because in setting a task you may specify filters to fetch specific contents in the data values. You may also specify the conditions each data value must satisfy to be included.



Content Sources

Mergemill accepts data feeds from diverse sources. Some fields in a job may obtain their data from an ODBC-compliant SQL database, and other fields in the same job may get their data from a local SQLite database, emails, a webpage, webpages on a URL list in a text file, a local text file, a remote text file via FTP, a local folder of text files, or a remote folder of text files via FTP.

A data source may provide structured or non-structured data. Sources providing structured data are delimited text files (commonly tab-separated or comma-separated), XML files, and SQL data stores. Non-structured data come from emails, plain text files, and HTML files.

A plain text file or an email message may provide its entire text content as a single data value, or it may provide non-structured data values, each bracketed with Mergemill data markers, like [FieldName]...[/]. You may easily extract these wrapped data values using Mergemill's fetch filters. In this case, text outside the markers is ignored by Mergemill.
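To show what the markers delimit, here is a minimal Python sketch that pulls the wrapped values out of a piece of text. It only mimics the effect of a fetch filter; the function name extract_marked and the sample text are hypothetical, and Mergemill's own filters are configured in the task settings rather than written as code.

```python
import re

def extract_marked(text, field_name):
    """Return every value wrapped as [field_name]...[/] in the text, in order."""
    pattern = re.compile(r"\[" + re.escape(field_name) + r"\](.*?)\[/\]", re.DOTALL)
    return pattern.findall(text)

# Hypothetical message body; everything outside the markers is ignored.
sample = "Hello [Name]Ann[/], please ship [Item]Lamp[/] and [Item]Desk[/]."
items = extract_marked(sample, "Item")
```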



Processing Sequence from Source to Output

:: Filter Actions

The default is no filter actions, which means all data values are included as they are. If filters are specified, data values are fed through the filters as they are read to decide if they should be included. If a value meets the include conditions, the pieces of text within the data value satisfying the fetch filters will be captured.

If only one field obtains data from a source, or a source provides non-structured data, the include filters specified for a field only affect the data values provided for it. If a source provides structured data for several fields, then a data value rejected for any one of those fields will exclude the entire row of data.
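The row-exclusion rule for structured data might be sketched in Python as follows. The function name filter_rows, the row layout, and the sample conditions are all assumptions for illustration; they are not Mergemill settings.

```python
def filter_rows(rows, include):
    """Keep a row only if every filtered field's value passes its include test."""
    return [
        row for row in rows
        if all(test(row[field]) for field, test in include.items())
    ]

# Hypothetical structured rows feeding two fields.
rows = [
    {"Name": "Ann", "Age": "34"},
    {"Name": "", "Age": "51"},     # rejected for Name, so the whole row is dropped
    {"Name": "Bob", "Age": "n/a"}, # rejected for Age, so the whole row is dropped
]
kept = filter_rows(rows, {"Name": lambda v: v != "", "Age": str.isdigit})
```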

If the fetch filters grow the data count of some columns of structured data, then for the other columns of the same stream you may opt to copy values down to the inserted cells or leave the inserted cells empty. Columns in other feeds are not affected.

If the fetch filters grow the data count of some columns of non-structured data, all other columns are not affected regardless of whether they are in the same feed. All Mergemill does is to append empty values to shorter columns so that the data count is uniform across all columns of the same feed.


:: Other Processings

All search-replace data processing is done AFTER the filter actions and BEFORE the Datafeed End-Processing operations. Below are the twelve Datafeed End-Processing operations you may apply to the data read. They are set in the task associated with each field.

1. Trim whitespaces: removes leading and trailing whitespace from data values.
2. Convert to titlecase: converts data values to initial caps.
3. Convert to uppercase: converts data values to uppercase.
4. Convert to lowercase: converts data values to lowercase.
5. Character to character-references: converts all special characters in the data values to their HTML representation, such as © to &copy;.
6. Skip <, >, ", and & in chr conversion: leaves the four symbols untouched when converting special characters to their character references.
7. Character-references to characters: converts all character-references back to the special characters.
8. Convert to web line breaks: preserves the look of line and paragraph breaks of non-HTML data texts in the HTML output, by inserting appropriate <br />, <p>, and </p> tags.
9. Encode URL components: encodes the component separators in a URL, such as www.Test&Try.com to www.test%26try.com.
10. Decode URL components: reverses the above processing.
11. Encode Base64: encodes arbitrary binary data into a 64-character alphabet composed of printable ASCII characters. Three bytes of input become four characters of output.
12. Decode Base64: reverses the above operation to return the original text string from an encoded one.
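For readers who want to see what several of these conversions produce, the Python standard library has close counterparts. The sketch below is only an analogy: Mergemill's exact output (for example, letter case or which characters are converted) may differ, and the sample value is made up.

```python
import base64
import html
from urllib.parse import quote, unquote

value = "  Test&Try  "

trimmed = value.strip()                   # trim whitespace
upper = trimmed.upper()                   # convert to uppercase
escaped = html.escape(trimmed)            # & becomes &amp;
restored = html.unescape("&copy; 2017")   # character reference back to ©
url_encoded = quote(trimmed, safe="")     # & becomes %26
url_decoded = unquote(url_encoded)        # reverses the URL encoding
b64 = base64.b64encode(b"abc").decode("ascii")    # three bytes in, four characters out
plain = base64.b64decode(b64).decode("ascii")     # reverses the Base64 encoding
```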

The last step of feed processing is data sorting. As mentioned earlier, Mergemill allows multi-level sorting, restricts each sort group to fields getting data from the same feed, and limits the scope of data sorting to the current data stream.

:: Output

AFTER all the data are read, filtered, processed and sorted, Mergemill runs the template to generate the output. When a field tag is encountered, the appropriate data value is fetched. Multiple occurrences of an in-loop field in the same pass of a loop get the same data value. Occurrences of each out-loop field are treated likewise. The actual text inserted at each place depends on the attributes inside the field tag.

When Mergemill encounters a field tag for which you specified dynamically generated data, i.e. autotext or XojoScript, the data value to be inserted is generated on-the-fly for EACH occurrence of the field. You specify in the associated task of the field the XojoScript to run or how the autotext is generated. The XojoScript is compiled only once for all occurrences of the field for optimum speed.



Dynamic Values in Output Path and Autotext

:: Field Value in Output Path and Output File Name

Mergemill lets you specify how the output path and output file name are built for a merge job, and you may include feed-insertion fields as components. The starting data value for the field is used. For an out-loop field, the starting data value is the first one in the current stream. For an in-loop field, the starting data value is the first value of the field for the current page.

:: Source Name and Extension in Output Path, Output File Name, and Autotext

You may specify a source filename or source filename extension as a component in building output path, output file name, or autotext. Mergemill always uses the first out-loop field in the template to determine the source filename and source extension for the current page. If you have multiple sources for the fields, and you want a certain field to set the source filename and extension where other out-loop fields appear earlier in your template, you may put the chosen field FIRST, but between <?HD?> and <?/HD?> to hide it from the output.

:: Sequence Number in Output Path, Output File Name, and Autotext

A sequence number component may be included in generating autotext values for a field, in which case you need to provide parameters in the form of Start;Increment in the task settings of the field, where Start is the beginning number and Increment is the step size between successive numbers in the series. For example, the parameters 012;2 give you the sequence 012, 014, 016, and so on till the end of the job.

Mergemill maintains the sequence numbers independently for each loop. If, for instance, <?[SeqNum]?> is an autotext field that is used outside and inside loops, Mergemill will begin them all at the Start value at the beginning of the job. As the loops are run, the sets of SeqNum will be out of step with each other.

The sequence number components in the output file name and output path are similar. The number begins at the Start value when the job begins, and the Increment is added for each subsequent new page, till the job ends. There is an additional option here: You may choose to skip the Start number in the file name or path. This is done by including the Suppress First Number switch (1 or 0) in the parameters: Start;Increment;Suppress. For example, 000;1;1 in the output file name may give you output.htm, output001.htm, output002.htm, and so on.
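The two parameter forms can be tried out with the Python sketch below. It assumes, based only on the examples above, that the number keeps the zero-padded width of the Start value; the function names sequence and file_names are hypothetical.

```python
from itertools import count, islice

def sequence(params):
    """Yield zero-padded numbers from 'Start;Increment' (a third part is ignored here)."""
    start_text, step_text = params.split(";")[:2]
    width = len(start_text)  # assumption: padding width follows the Start value
    for number in count(int(start_text), int(step_text)):
        yield str(number).zfill(width)

def file_names(base, extension, params, pages):
    """Build output file names, honouring the optional Suppress First Number switch."""
    parts = params.split(";")
    suppress = len(parts) > 2 and parts[2] == "1"
    names = []
    for index, number in enumerate(islice(sequence(params), pages)):
        names.append(base + ("" if suppress and index == 0 else number) + extension)
    return names
```

With these definitions, list(islice(sequence("012;2"), 3)) gives 012, 014, 016, and file_names("output", ".htm", "000;1;1", 3) gives output.htm, output001.htm, output002.htm.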



XojoScript

The XojoScript dynamic data source allows you to execute Xojo BASIC code within Mergemill and assign the result to a Mergemill field tag.

To use it, follow the steps below.

1. Add scripts to your Mergemill document via the Source: XojoScripts page.
   * Use the structure in the Script Example below for your scripts.
   * Use the Test Compile and Test Run buttons to check that your script works. You may first need to comment out the parameter assignment block and temporarily assign some known values to test the script, like
         // vPrincipal = Val(Input("Principal"))
         vPrincipal = 100
   * Once your script tests OK, remove the temporary parameter assignment lines and uncomment those that use the Input function.

2. Add a field tag to your template where you intend to insert XojoScript result values.

3. Set up a job that uses the template, and in the task settings of the XojoScript insertion field, do the following:
   * Specify XojoScript as the data source.
   * Select the script to use from the list of scripts that have been added to the Mergemill document.
   * Enter or select values to be passed as parameters to the XojoScript.

Please note that the parameter to be passed could either be a static value or a feed-insertion field to fetch data. If you want to assign a template field as a parameter, just drag-and-drop it from the Template Field column to the appropriate Value or Field Name cell. When the script is run, the current first-column data value of the field (i.e., <?[Field]?> or <?[Field]{1}?>) will be passed to the script.

:: Another Productive Use of Your BASIC Skills

The Xojo BASIC language is very similar to Microsoft Visual Basic, so you may use your favourite BASIC development tool to develop and test your scripts. Mergemill does not provide code development facilities such as an IDE or a debugger. It offers only an edit field in which to insert and save your script, and a compiler to translate it into fast machine code.

If you use Xojo to develop your XojoScript, we have included a simple project file XojoScript in the Mergemill software package for you. It contains the recommended script structure in the Open event of the App class. You may develop your script there, and test it by selecting Run on the Project menu.

:: Script Example

If you know BASIC well and your code is simple, you may develop and test it in Mergemill. Below is a simple example to calculate compound growth. It shows the four parts of a typical XojoScript.

Part 1 is the variable declaration:
    Dim vFuture, vPrincipal, vRate, vPeriods as Double
Part 2 is the assignment of parameter values to the variables:
    vPrincipal = Val(Input("Principal"))
    vRate = Val(Input("Rate%"))
    vPeriods = Val(Input("Periods"))
Part 3 is the algorithm:
    vFuture = vPrincipal * ((100 + vRate) / 100) ^ vPeriods
Part 4 returns the result to the field, to be inserted in the output:
    Print Str(vFuture)
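To check the arithmetic outside Mergemill, the same formula can be mirrored in Python. This is only a cross-check of the calculation, not a XojoScript; the function name future_value and the sample inputs (principal 100, rate 5%, 10 periods) are made up for the example.

```python
def future_value(principal, rate_percent, periods):
    """Compound growth: principal * ((100 + rate) / 100) ^ periods."""
    return principal * ((100 + rate_percent) / 100) ** periods

# 100 growing at 5% per period for 10 periods is roughly 162.89.
result = future_value(100, 5, 10)
```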



Field Attributes

<?[Field]?>

Attributes you may include in the basic field tag are listed below. The field name and the attribute list are separated by a colon, and the attributes are separated by semicolons.

 

OutLoop: instructs Mergemill not to loop the current field which is inside a loop body
AWords,nn: returns the first nn words of the data text for the current field
ZWords,nn: returns the last nn words of the data text for the current field
Left,nn: returns the first nn characters of the data text for the current field
Right,nn: returns the last nn characters of the data text for the current field
Mid,ss,nn: returns the nn characters of the data text for the current field, beginning at ss
UCase: returns the data text for the current field in uppercase
LCase: returns the data text for the current field in lowercase
TCase: returns the data text for the current field in titlecase, i.e., initial caps
Count: returns the total number of data values in the data column of the current stream for the field
Position: returns the position number of the current data value in the data column for the field

 

 

Examples:

    <?[Category:OutLoop]?>
    <?[Description:AWords,3;ZWords,1]?>
    <?If([Category:OutLoop;Awords,1] = Automobile)?>

If you need to place an out-loop field inside a loop body, add the OutLoop attribute.

The attributes are applied in order. So AWords,3;ZWords,1 inserts the third word of the data text into the output, and ZWords,3;AWords,1 inserts the third-last word. Also, Position;Count is effectively just Count.

If the attribute parameter nn is missing or less than 1, AWords, ZWords, Left, and Right return the entire field data text unaltered, and Mid includes all characters from the starting position ss to the end of the field data text. If the attribute parameter ss in Mid is missing or less than 1, Mid starts at the first character of the field data text. If both parameters in Mid are missing or less than 1, Mid returns the entire field data text unaltered.

Please note that Position and Count nullify all other attributes EXCEPT OutLoop.
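The in-order application of the text attributes can be modelled with the Python sketch below. It covers only AWords, ZWords, Left, Right, Mid, and the case attributes; OutLoop, Count, and Position depend on the feed and are left out. The function name apply_attributes, the case-insensitive matching of attribute names, and splitting words on whitespace are assumptions, not documented Mergemill behaviour.

```python
def apply_attributes(text, attributes):
    """Apply a semicolon-separated attribute list such as 'AWords,3;ZWords,1' in order."""
    for attribute in attributes.split(";"):
        name, *args = [part.strip() for part in attribute.split(",")]
        nums = [int(arg) if arg.isdigit() else 0 for arg in args]  # missing -> 0
        n = nums[0] if nums else 0
        key = name.lower()
        if key == "awords" and n >= 1:
            text = " ".join(text.split()[:n])
        elif key == "zwords" and n >= 1:
            text = " ".join(text.split()[-n:])
        elif key == "left" and n >= 1:
            text = text[:n]
        elif key == "right" and n >= 1:
            text = text[-n:]
        elif key == "mid":
            start = max(n, 1)                       # ss missing or < 1: first character
            length = nums[1] if len(nums) > 1 else 0
            end = start - 1 + length if length >= 1 else None  # nn missing: to the end
            text = text[start - 1:end]
        elif key == "ucase":
            text = text.upper()
        elif key == "lcase":
            text = text.lower()
        elif key == "tcase":
            text = text.title()
    return text
```

Applied to the made-up text "The quick brown fox jumps high", the list AWords,3;ZWords,1 yields the third word (brown), while ZWords,3;AWords,1 yields the third-last word (fox).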



Number Format

<?[Field]@NumFormat?>

If the value for a field is numerical, you may wish to control its format in the output. The NumFormat specification enables you to do that. It can contain up to three formats separated by semicolons: positive format; negative format; zero format.

Each format is a string of special characters that control how the number is formatted:
"#" displays a digit where the number has one, and nothing otherwise
"0" displays a digit where the number has one, and 0 otherwise
"." displays a decimal point
"," displays a thousands separator
"%" displays the number multiplied by 100
"(" displays an opening parenthesis
")" displays a closing parenthesis
"+" displays a sign to the left of the number, which may be positive or negative
"-" displays a minus sign if the number is negative
"E" or "e" displays the number in scientific notation
"\character" displays the character following the backslash

Examples:

    12345.6 with ###,##0.00 inserts 12,345.60
    0.17 with #% inserts 17%
    12345.6 with #.##e+ inserts 1.23e+4
    0 with ###,##0.00;(###,##0.00);\z\e\r\o inserts zero
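How one of the three formats gets chosen can be sketched in Python. The sketch does not implement the formatting characters themselves; it only selects the section and resolves backslash escapes in an all-literal section. The function names are hypothetical, and falling back to the first format when a section is missing is an assumption that this guide does not state.

```python
def pick_format(value, spec):
    """Choose the positive, negative, or zero format from 'positive;negative;zero'."""
    sections = spec.split(";")
    if value < 0 and len(sections) > 1:
        return sections[1]
    if value == 0 and len(sections) > 2:
        return sections[2]
    return sections[0]  # assumption: missing sections fall back to the first format

def literal_text(section):
    """Resolve backslash escapes in a section made up only of literal characters."""
    out, i = [], 0
    while i < len(section):
        if section[i] == "\\" and i + 1 < len(section):
            i += 1  # skip the backslash and keep the character after it
        out.append(section[i])
        i += 1
    return "".join(out)

SPEC = r"###,##0.00;(###,##0.00);\z\e\r\o"
zero_text = literal_text(pick_format(0, SPEC))
```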



Data Position Offset

<?[Field]{Column}?>
<?[Field]{Column}@NumFormat?>

Mergemill makes it easy for you to generate multi-column lists on the output page. You may start by adding the loop tag pair to your template to mark out the part to be repeated. Then add the column number to the in-loop field tags. When you run the job, Mergemill manages putting the series of data values from your feeds into the columns of fields.

All field tags have a default column number of 1 if not specified. This is the minimum allowed value. For dynamic fields, the column number (if any) is ignored because each data value is generated on-the-fly.

It is important to remember that the data value for the first-column field is always considered used for the current run of the template or loop. However, for field tags with a data value position offset (i.e. a column number greater than 1), a data value is considered used only when Mergemill acts on the request of such a field tag (or a field operand in an expression tag) to fetch the value. No such action is taken when the field tag appears within a hide section, or is skipped in a branching structure, such as <?If(...)?>...<?Else?><?[Field]{2}?><?EndIf?> where the IF condition returns true. A value is also not considered used if the actions to fetch values are only for carrying out comparisons in the branching tags <?IF(expression...)?>, <?IF(SAME[...)?>, and <?CASE(expression...)?>.

For each subsequent run of the template or loop, Mergemill will continue on with the data value immediately after the one last USED.

Consider the following HTML code segment as an example:

<table width="700" border="0" cellspacing="0" cellpadding="0">
<tr align="left" valign="top">
<td width="100" align="center"><p>Year</p></td>
<td width="300" align="center"><p>Q1 Sales</p></td>
<td width="300" align="center"><p>Q2 Sales</p></td>
</tr>
<?LOOP?><tr align="left" valign="top">
<td width="100"><p><?[Year]{4}?></p></td>
<td width="300"><p><?[Sales]{1}@##,###,##0?></p></td>
<td width="300"><p><?[Sales]{2}@##,###,##0?></p></td>
<?VR:vDummy?><?[Sales]{4}?><?/VR?></tr>
<?ENDLOOP?>
</table>

The template segment above generates a list on a web page that compares the first and second quarter sales figures across several years. The feed comes from a CSV text file whose contents are partly shown below.

    "Year","Quarter","Sales"
    2000,1,1186100
    2000,2,945300
    2000,3,1077200
    2000,4,1683300
    2001,1,1308200
    ...

The column number 4 in the Year tag instructs Mergemill to skip three values in each pass of the loop. To keep the columns in step, we need to add a 4th column for Sales as well. Since this value does not need to be inserted but needs to be considered used, we put it in a variable assignment block.
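The bookkeeping of "used" values can be modelled in a few lines of Python. This is a simplified model of the rule described above, using only the five CSV rows shown; the function name run_pass and the per-field cursor dictionary are assumptions, and hide sections, branching, and stream breaks are not modelled.

```python
# Data columns taken from the five CSV rows shown above.
COLUMNS = {
    "Year": [2000, 2000, 2000, 2000, 2001],
    "Sales": [1186100, 945300, 1077200, 1683300, 1308200],
}

def run_pass(columns, cursors, requests):
    """Fetch each (field, column number) request for one pass of the loop.

    Returns the fetched values and new cursors, where each requested field
    moves to the value immediately after the last one it used.
    """
    values = [columns[field][cursors[field] + number - 1] for field, number in requests]
    last_used = {}
    for field, number in requests:
        last_used[field] = max(last_used.get(field, 1), number)
    new_cursors = {field: cursors[field] + last_used.get(field, 0) for field in cursors}
    return values, new_cursors

start = {"Year": 0, "Sales": 0}
with_dummy = [("Year", 4), ("Sales", 1), ("Sales", 2), ("Sales", 4)]
without_dummy = [("Year", 4), ("Sales", 1), ("Sales", 2)]

values, in_step = run_pass(COLUMNS, start, with_dummy)
_, out_of_step = run_pass(COLUMNS, start, without_dummy)
```

With the fourth Sales column requested, both fields resume at the 2001 row on the next pass. Without it, Sales would resume at the third-quarter value of 2000 while Year resumes at 2001, which is the misalignment the variable assignment block prevents.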

It is important to note that Mergemill treats all out-loop template segments together as one special loop body. Column numbers are handled exactly as in any other loop, and multiple occurrences of a field use the same data value in all out-loop template segments.



Look-up

<?[LookupValueField]([LookupKeyField]=[Field])?>
<?[LookupValueField]([LookupKeyField]=[Field])@NumFormat?>
<?[LookupValueField]([LookupKeyField]=[Field]{Column})?>
<?[LookupValueField]([LookupKeyField]=[Field]{Column})@NumFormat?>

The lookup extension of the field tag instructs Mergemill to take the current value for Field, locate the same value under the LookupKeyField data column, get the corresponding value for LookupValueField, and insert it into the output. All attributes are ignored in the lookup key field and the lookup value field.
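In Python terms, the lookup amounts to something like the sketch below. The function name lookup and the sample columns are hypothetical, and two behaviours are assumed because this guide does not specify them: the first matching key wins, and a missing key yields an empty value.

```python
def lookup(key_column, value_column, key):
    """Return the value paired with the first matching key, or '' if none matches."""
    for candidate, value in zip(key_column, value_column):
        if candidate == key:
            return value
    return ""  # assumption: no match inserts nothing

# Hypothetical lookup feed: product codes and their names.
codes = ["A10", "B20", "C30"]
names = ["Lamp", "Desk", "Chair"]
found = lookup(codes, names, "B20")
```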

A good way to learn how the above tags work is to see them in action. We've included many examples in the Mergemill Pro software package for this purpose: Mergemill Pro > Mergemill Resources > Examples > English.


Copyright © 2001-2017 Cross Culture Ltd. All Rights Reserved.