pfx-csv:unmarshal | Pricefx Knowledge Base

Overview

Converts CSV text (from the exchange body) into a List<Map<String, String>> (default) or List<List<String>> (when useMaps=false).

Input: The exchange body must be convertible to an InputStream (e.g. String, byte[], File, or InputStream).

Output: The exchange body is replaced with the parsed collection.
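
To make the default output shape concrete, here is a small sketch based on the description above. The route id, the direct:parseCsv endpoint, and the sample data are hypothetical and serve only as illustration.

```xml
<route id="csvShapeDemo">
    <from uri="direct:parseCsv"/>
    <!--
        Sample body sent to this route (hypothetical data):
            sku,price
            A-1,9.99
            B-2,4.50
    -->
    <to uri="pfx-csv:unmarshal"/>
    <!--
        With the defaults (useMaps=true, header taken from the first line),
        the body is expected to be a list with one map per data row:
            [{sku=A-1, price=9.99}, {sku=B-2, price=4.50}]
        All values stay Strings; no numeric conversion is implied.
    -->
    <log message="Parsed CSV body: ${body}"/>
</route>
```

Logging the body right after the unmarshal step is a cheap way to confirm the shape before wiring in a mapper.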

Header detection:

If header is set, those names are used.
If header is not set, the first line is parsed as the header.
When used inside a <split>, the header from the first batch (split index 0) is cached and reused for subsequent batches via ExchangeCache.
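
The auto-detection rules above can be sketched as a split route that passes no header parameter. This assumes the first line of the file holds the column names; the route id and the direct:processRows endpoint are hypothetical.

```xml
<route id="csvAutoHeaderImport">
    <from uri="file:{{import.fromUri}}"/>
    <split>
        <tokenize group="5000" token="\n"/>
        <!-- No header parameter: per the rules above, the first line of the
             first batch (split index 0) is parsed as the header and reused
             for the batches that follow. -->
        <to uri="pfx-csv:unmarshal"/>
        <!-- Hypothetical downstream endpoint that consumes the parsed rows -->
        <to uri="direct:processRows"/>
    </split>
</route>
```

This relies on the default camelSplitIndexAware=true. It is worth checking the keys of the first and second batch in a test run to confirm the cached header is applied as expected.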

Encoding: Determined by the CamelCharsetName exchange property. Defaults to UTF-8. BOM (Byte Order Mark) is automatically stripped.

Error reporting: If a parse error occurs inside a split, the error message includes the absolute line number calculated from the split index and split size.

Properties

Parameter	Type	Default	Description
`format`	`String`	`DEFAULT`	Base CSVFormat preset name (e.g. DEFAULT, EXCEL, TDF, RFC4180).
`delimiter`	`String`	`,`	Field delimiter character. Supports Java escape sequences (e.g. \t for tab).
`header`	`String`	(auto-detect)	Comma-separated list of column names. When omitted, the first record is used as the header.
`skipHeaderRecord`	`Boolean`	`false`	Whether to skip the first line when it is a header.
`useMaps`	`Boolean`	`true`	When true, each row becomes a Map. When false, each row becomes a List of values.
`headerPolicy`	NORMAL / STRICT	`NORMAL`	When STRICT, the header parameter is required and must match exactly.
`quoteCharacter`	`Character`	`"`	Character used to quote field values.
`quoteDisabled`	`Boolean`	`false`	Set to true to disable quoting entirely.
`recordSeparator`	`String`	Platform default	Record (line) separator. Accepts CR, LF, CRLF tokens.
`nullString`	`String`	(none)	String to interpret as null when reading.
`trim`	`Boolean`	(none)	Whether to trim leading/trailing whitespace from field values.
`camelSplitIndexAware`	`Boolean`	`true`	Uses the Camel SPLIT_INDEX property for split-batch behavior: header is cached from the first batch.
`forceSkipHeaderWhenPartOfSplit`	`Boolean`	`true`	When true and exchange is part of a split (index > 0), header line is auto-skipped.
`lazyStartProducer`	`Boolean`	`false`	(Advanced) Whether to defer producer creation until the first message is processed.
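
Several of the less common parameters can be combined on one endpoint. The following is a minimal sketch rather than a recommended configuration: the parameter values are illustrative, and the literal NULL token is a hypothetical convention of the source file.

```xml
<!-- Start from the EXCEL preset, trim whitespace around values,
     and read the literal text NULL as a null value -->
<to uri="pfx-csv:unmarshal?format=EXCEL&amp;trim=true&amp;nullString=NULL"/>
```

Parameters set explicitly on the URI are presumably applied on top of the chosen preset, so it is worth verifying the combined result against a sample file.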

Examples

Basic CSV Import (DMDS)

Read a CSV file, split into 5000-line batches, unmarshal, and load into a Pricefx Data Source:

XML

<route id="csvImportToDatasource">
    <from uri="file:{{import.fromUri}}"/>
    <split>
        <tokenize group="5000" token="\n"/>
        <to uri="pfx-csv:unmarshal?header=sku,label,price&amp;skipHeaderRecord=true&amp;delimiter=,"/>
        <to uri="pfx-api:loaddata?mapper=myMapper&amp;objectType=DM&amp;dsUniqueName=Product"/>
    </split>
    <onCompletion onCompleteOnly="true">
        <to uri="pfx-api:flush?dataFeedName=DMF.Product&amp;dataSourceName=DMDS.Product"/>
    </onCompletion>
</route>

Tab-Delimited File

Use a tab delimiter with the \t escape sequence:

XML

<to uri="pfx-csv:unmarshal?delimiter=\t&amp;header=sku,name,price&amp;skipHeaderRecord=true"/>

Pipe-Delimited with Quoting Disabled

XML

<to uri="pfx-csv:unmarshal?delimiter=|&amp;quoteDisabled=true&amp;header=id,name,value"/>

Strict Header Validation

Fail the route if the CSV file header does not exactly match the expected columns:

XML

<to uri="pfx-csv:unmarshal?header=sku,label,price&amp;headerPolicy=STRICT&amp;skipHeaderRecord=true"/>

Unmarshal to Lists Instead of Maps

When map keys are not needed (e.g. positional data):

XML

<to uri="pfx-csv:unmarshal?useMaps=false&amp;header=col1,col2,col3&amp;skipHeaderRecord=true"/>

Setting Encoding Explicitly

XML

<setProperty name="CamelCharsetName">
    <constant>ISO-8859-1</constant>
</setProperty>
<to uri="pfx-csv:unmarshal?header=sku,price&amp;skipHeaderRecord=true"/>

Common Pitfalls

Forgetting skipHeaderRecord=true when the file contains a header line and the header parameter is also set. Without it, the file's header line is parsed as the first data row.
Header mismatch with headerPolicy=STRICT — the actual header in the CSV must match the header parameter exactly (same order, same case, same number of columns).
Unescaped ampersands in XML. Separate URI query parameters with &amp; rather than a bare &, which is not valid inside an XML attribute value.
Encoding issues — The component reads the CamelCharsetName exchange property for character encoding. If not set, UTF-8 is assumed. BOM is automatically stripped.
Split batching header behavior — When camelSplitIndexAware=true (default), the header is read from the first batch and cached. Set camelSplitIndexAware=false if each split chunk is independent.