|
Processing multiple files |
Pipelines v1.6 |
|
● |
Method 1

This first method can only be used when the pipeline writes exactly one output record for each input record that it reads. The pipeline must not reorder the records or alter their number; you can, however, translate them. The following example pipeline shows how to specify a pipeline that can process multiple files.

    pipe (endchar ?) filelist noh  .* Generate the list of input files.
    | specs w5-* 1                 .* Discard the file stats.
    | a: <                         .* Open and read each file; route the filename to secondary output stream.
    | specs w2-* 1                 .* Discard the record line number.
    .* ...
    .* ...
    .* Do your processing here!
    .* You can specify multi-stream input and output intersection labels on stages that route
    .* the records in and out of this section, as long as they do not alter the number of records
    .* or their sequence.
    .* ...
    .* ...
    | b: >                         .* Write the records to the file specified in the secondary input stream.
    ? a:
    | take *                       .* Select all the records.
    | b:                           .* Route back to > secondary input stream.

The pipeline works as follows. Each time the FILELIST stage finds a file, it writes a record that contains the name of that file to its primary output stream. The < stage reads this record, opens the file named in the record for input, and begins reading records. For each record that < reads, it first writes a record containing the input filename to its secondary output stream, and then writes the input file record to its primary output stream. You then specify the stages that you want to operate on the records, as long as they conform to the constraints described above. The last of these stages must write its records to its primary output stream. Next, the > stage reads a record from its secondary input stream, which is the record that < wrote to its secondary output stream. This record names the file to write to. If > determines that the record contains a filename different from the previous one, it closes the current output file and opens the new one. Then > reads a record from its primary input stream and writes it to the output file. This process continues until FILELIST cannot find any more files and terminates, causing the pipeline to end. |
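
The reason for the one-record-in, one-record-out constraint can be illustrated outside the Pipelines language. The following Python sketch is only a model, not Pipelines code: `run_method1`, `files`, and `stage` are hypothetical names, files are represented as an in-memory dictionary, and pairing with `zip` stands in for the way > reads one filename and one record at a time.

```python
def run_method1(files, stage):
    """Model of Method 1: filenames and records travel on separate streams.

    files: dict mapping a filename to its list of records (assumed layout).
    stage: callable that takes an iterable of records and returns an
           iterable of records (stands in for "your processing here").
    """
    # Model of the < stage: one filename record per file record.
    pairs = [(name, rec) for name, recs in files.items() for rec in recs]
    names = [name for name, _ in pairs]        # secondary stream
    records = stage(rec for _, rec in pairs)   # primary stream

    # Model of the > stage: take one filename and one record at a time
    # and append the record to the file that the filename denotes.
    out = {}
    for name, rec in zip(names, records):
        out.setdefault(name, []).append(rec)
    return out


files = {"a.txt": ["x", "y"], "b.txt": ["z"]}

# A translation keeps the count and order, so every record lands in its own file.
translated = run_method1(files, lambda recs: (r.upper() for r in recs))

# Discarding a record shifts the pairing: "z" ends up under a.txt.
misaligned = run_method1(files, lambda recs: (r for r in recs if r != "x"))
```

In this model the second call shows the failure mode: once a record is discarded, the remaining records are paired with the wrong filenames, which is why Method 1 rules out such stages.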
||
|
● |
Method 2

This second method allows you to process multiple files and to perform translations that alter the number and sequence of the records. You can sort, split, or discard records, or introduce new ones. To do this, however, you need to construct two separate pipelines. Consider the pipelines list.ppl and format.ppl, below. Each time the FILELIST stage in the first pipeline, list.ppl, finds a file, it writes a record that contains the name of that file to its primary output stream. The SPECS stage reads this record, isolates the filename, surrounds it with quotation marks ("), and writes the modified record to its primary output stream. Finally, the RUNPIPE stage reads this record from its primary input stream and launches the specified pipeline, format.ppl, with the record as its command-line argument. The second pipeline, format.ppl, substitutes its command-line argument, which is the name of the file to update, for the argument placeholder &arg1.

list.ppl

    pipe filelist noh
    | specs /"/ 1 w5-* n /"/ n
    | runpipe format.ppl

format.ppl

    pipe < &arg1
    .* ...
    .* ...
    .* Do your processing here!
    .* You can specify multi-stream input and output intersection labels on stages that route
    .* the records in and out of this section.
    .* ...
    .* ...
    | > &arg1

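
The division of labour between the two pipelines can be modelled in Python. This is a rough sketch under stated assumptions, not a translation of the Pipelines stages: `list_and_format` and `format_file` are hypothetical names, the sort-and-drop-blank-lines step is an arbitrary example of processing that changes the number and order of records, and the sketch reads each file completely before writing it back.

```python
from pathlib import Path


def format_file(path):
    """Model of format.ppl: process one file, named by its argument."""
    lines = Path(path).read_text().splitlines()
    # Example processing only: discard blank records and sort the rest.
    # Changing the record count and order is safe here because each
    # run handles exactly one file.
    kept = sorted(line for line in lines if line.strip())
    Path(path).write_text("".join(line + "\n" for line in kept))


def list_and_format(directory):
    """Model of list.ppl: find each file and launch the formatter on it."""
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            format_file(path)
```

Because the formatter is launched once per file, no filename has to travel alongside the records, and the per-file processing is free to add, drop, or reorder them.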
A pipeline launched by the RUNPIPE stage (with the default WAIT operand) always passes its return code back to the calling pipeline. This allows you to construct a chain of pipelines that unravels when any one pipeline fails with an error code. |
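
The idea of a caller receiving the return code of each launched run can be sketched with Python child processes. This only models the behaviour described above and makes assumptions the text does not state: `run_for_each` and `CHILD_SOURCE` are hypothetical names, the child is a one-line Python script rather than a pipeline, and stopping at the first nonzero code is one possible way for a caller to react.

```python
import subprocess
import sys

# Hypothetical child: exits with code 3 when its argument is "bad".
CHILD_SOURCE = "import sys; sys.exit(3 if sys.argv[1] == 'bad' else 0)"


def run_for_each(items, child_source=CHILD_SOURCE):
    """Launch one child per item, wait for it, and pass back its return code."""
    for item in items:
        # subprocess.run waits for the child to finish before returning.
        completed = subprocess.run([sys.executable, "-c", child_source, item])
        if completed.returncode != 0:
            # Hand the failing code to our own caller, so a chain of
            # callers can stop in turn.
            return completed.returncode
    return 0
```

A caller could pass the result on with `sys.exit(run_for_each(names))`, so that a failure in an inner run becomes the exit code of the outer one.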
|
|