pfx-excel:unmarshal

Transforms the given Excel file into an internal structure.

  • Input: InputStream — the Excel file content (XLS or XLSX).

  • Output: List<Map<String, String>> — a list of rows, where each row is a map of column name to value.

The format of the input (XLS vs XLSX) is automatically detected using Apache POI's FileMagic (stream-safe detection via mark/reset since PFIMCORE-2862).
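To illustrate what stream-safe detection via mark/reset means, the following Java sketch peeks at the first eight bytes of a stream, compares them with two well-known file signatures (OLE2 compound file for XLS, ZIP container for XLSX), and rewinds the stream so that a later parser still sees the whole file. It is neither the component's code nor Apache POI's FileMagic implementation; the class name ExcelFormatSniffer and the string return values are hypothetical and exist only for this example.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only; not part of pfx-excel or Apache POI.
class ExcelFormatSniffer {

    // Signature of an OLE2 compound file, the container used by legacy .xls.
    private static final byte[] OLE2 = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
                                        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1};
    // Signature of a ZIP archive, the container used by .xlsx.
    // A ZIP signature alone does not prove the archive is a workbook.
    private static final byte[] ZIP = {0x50, 0x4B, 0x03, 0x04};

    /** Returns "XLS", "XLSX", or "UNKNOWN" and leaves the stream position unchanged. */
    static String sniff(InputStream in) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        try {
            in.mark(8);                          // remember the current position
            byte[] head = new byte[8];
            int n = in.readNBytes(head, 0, 8);   // peek at up to 8 bytes
            in.reset();                          // rewind so nothing is consumed
            if (startsWith(head, n, OLE2)) return "XLS";
            if (startsWith(head, n, ZIP)) return "XLSX";
            return "UNKNOWN";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean startsWith(byte[] data, int len, byte[] prefix) {
        if (len < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] zipLike = {0x50, 0x4B, 0x03, 0x04, 0, 0, 0, 0};
        // Wrapping in BufferedInputStream is a common way to get mark/reset support.
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(zipLike));
        System.out.println(sniff(in)); // prints XLSX
    }
}
```

Because the stream is reset after the peek, the same InputStream can be handed to the parser afterwards without buffering the file a second time.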

Properties

| Option | Type | Default | Since | Description |
|---|---|---|---|---|
| hasHeaderRecord | Boolean | true | IM 1.1.18 | Indicates whether the input contains a header record (must be on the first row). |
| skipHeaderRecord | Boolean | false | IM 1.1.18 | Determines whether to skip the header record in the output. |
| header | String | | IM 1.1.18 | Comma-separated list of headers to use. When hasHeaderRecord=false, this defines the column names for the output maps. When hasHeaderRecord=true, this filters which columns appear in the output. |
| sheetIndex | Integer | 0 | IM 1.1.18 | Index of the sheet with the required data. |
| sheetName | String | | IM 1.1.18 | Name of the sheet with the required data. If filled, takes precedence over sheetIndex. |
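The two roles of the header option can be easier to see in code. The Java sketch below is a model of the documented behavior, not the component's implementation: it takes already-extracted cell strings and builds the list of row maps, using header either as the column names (hasHeaderRecord=false) or as a column filter (hasHeaderRecord=true). The class HeaderMappingSketch is hypothetical. Two details are assumptions made for the example: a missing trailing cell becomes an empty string, and a missing header with hasHeaderRecord=false raises an exception. skipHeaderRecord is deliberately left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the header / hasHeaderRecord semantics; illustration only.
class HeaderMappingSketch {

    static List<Map<String, String>> toRows(List<List<String>> cells,
                                            boolean hasHeaderRecord,
                                            String header) {
        List<Map<String, String>> out = new ArrayList<>();
        if (cells.isEmpty()) {
            return out;
        }
        List<String> requested = parseHeader(header);
        List<String> columns;
        int firstDataRow;
        if (hasHeaderRecord) {
            columns = cells.get(0);   // first row supplies the column names
            firstDataRow = 1;         // and is not emitted as data
        } else {
            if (requested.isEmpty()) {
                // Assumption: the real component's behavior in this case is not documented here.
                throw new IllegalArgumentException("header is required when hasHeaderRecord=false");
            }
            columns = requested;      // header option supplies the column names
            firstDataRow = 0;
        }
        for (int r = firstDataRow; r < cells.size(); r++) {
            List<String> row = cells.get(r);
            Map<String, String> record = new LinkedHashMap<>();
            for (int c = 0; c < columns.size(); c++) {
                String name = columns.get(c);
                // With a header record present, the header option acts as a filter.
                boolean filteredOut = hasHeaderRecord && !requested.isEmpty()
                        && !requested.contains(name);
                if (!filteredOut) {
                    // Assumption: missing trailing cells are treated like blank cells.
                    record.put(name, c < row.size() ? row.get(c) : "");
                }
            }
            out.add(record);
        }
        return out;
    }

    private static List<String> parseHeader(String header) {
        List<String> names = new ArrayList<>();
        if (header != null) {
            for (String part : header.split(",")) {
                if (!part.isBlank()) names.add(part.trim());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<List<String>> cells = List.of(
                List.of("sku", "name", "price"),
                List.of("A1", "Widget", "9.5"));
        System.out.println(toRows(cells, true, "sku,price")); // [{sku=A1, price=9.5}]
        System.out.println(toRows(cells, false, "a,b,c"));    // [{a=sku, b=name, c=price}, {a=A1, b=Widget, c=9.5}]
    }
}
```

The second call shows why hasHeaderRecord=false should only be used for files that really have no header: the first row is then treated as data.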

Numeric Value Handling (since PFIMCORE-2881, March 2026)

Cell values are extracted with the following rules:

| Cell Type | Handling |
|---|---|
| STRING | Returned as-is |
| BOOLEAN | Converted to "true" / "false" |
| NUMERIC (whole number) | Converted to long if within Long.MIN_VALUE to Long.MAX_VALUE range |
| NUMERIC (decimal) | Formatted using BigDecimal.toPlainString() to avoid scientific notation (e.g., 0.000178495 instead of 1.78495E-4) |
| NUMERIC (NaN / Infinity) | Returned as "NaN" / "Infinity" |
| NUMERIC (date) | Formatted as date string |
| BLANK / OTHER | Empty string "" |
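The numeric rules can be approximated in a few lines of Java. The sketch below is one possible reading of them, not the component's source: NumericCellFormatter is a hypothetical name, date-formatted cells are not handled, the use of BigDecimal.valueOf (rather than another BigDecimal constructor) is an assumption, and so is the treatment of negative infinity and of whole numbers outside the long range, which the rules above do not spell out.

```java
import java.math.BigDecimal;

// Hypothetical formatter mirroring the documented numeric rules; illustration only.
class NumericCellFormatter {

    static String format(double value) {
        if (Double.isNaN(value)) {
            return "NaN";
        }
        if (Double.isInfinite(value)) {
            // Assumption: negative infinity is rendered with a leading minus sign.
            return value > 0 ? "Infinity" : "-Infinity";
        }
        // 0x1p63 is 2^63 as a double, the first value above Long.MAX_VALUE.
        boolean fitsLong = value >= -0x1p63 && value < 0x1p63;
        if (fitsLong && value == Math.rint(value)) {
            // Whole number: go through long so no fractional part is printed.
            return Long.toString((long) value);
        }
        // Decimal (or, by assumption, an out-of-range whole number):
        // toPlainString() never uses scientific notation.
        return BigDecimal.valueOf(value).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(format(42.0));        // 42
        System.out.println(format(1.78495E-4));  // 0.000178495
        System.out.println(format(Double.NaN));  // NaN
    }
}
```

BigDecimal.valueOf starts from the shortest decimal representation of the double, which is why the example prints 0.000178495 rather than a long binary expansion.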

Examples

Transform Excel file into internal representation (default)

XML
<to uri="pfx-excel:unmarshal"/>

Transform Excel file without header

XML
<to uri="pfx-excel:unmarshal?header=sku,name&amp;hasHeaderRecord=false"/>

Read from a specific sheet by name

XML
<to uri="pfx-excel:unmarshal?sheetName=Products"/>

Common Pitfalls

  • All values are returned as strings: numeric cells are converted to their string representation. If you need typed values in Pricefx, use converters in your mapper (e.g., stringToDecimal, stringToInteger).

  • Header row consumed: when hasHeaderRecord=true (default), the first row is used as column names and is not included in the output data. If your file has no header, set hasHeaderRecord=false and provide header=col1,col2,....

  • Multi-sheet files: by default only sheet index 0 is read. Use sheetName or sheetIndex to target a different sheet. There is no built-in way to read all sheets in one call; create separate routes or use Groovy for multi-sheet processing.

  • Large files: unmarshal loads the entire file into memory. For files with 10k+ rows, use streamingUnmarshal instead.