pfx-validator Component


Overview

The pfx-validator component validates CSV and XLSX files before they are processed by import routes. It checks structure, headers, line format, and optional schema rules, so that a bad file fails fast before its data reaches Pricefx.

URI pattern: pfx-validator:format[?options]


Methods

| Method | Description |
|--------|-------------|
| csv | Validate a CSV file body |
| xlsx | Validate an XLSX file body |


Parameters

Common (CSV and XLSX)

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| checkEmptyLines | boolean | false | Fail if any non-empty lines violate format |
| onlyPrintWarning | boolean | false | Log validation errors as warnings instead of throwing an exception |
| readNumberOfLinesOnly | boolean | false | Only count lines, skip content validation |
| validationSchemaName | String | -- | Schema reference for structural validation |
| hasHeaderRecord | Boolean | true | Whether the input contains a header row |
| header | String | -- | Expected header column names (comma-separated) |
| skipHeaderRecord | Boolean | false | Skip the header row in output after validation |
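
As a minimal sketch of how the header-related options combine, the fragment below checks the header row against the expected column names and then, going by the skipHeaderRecord description above, leaves the header out of the validated output. The column names are illustrative, and this page does not cover how downstream steps should be configured once the header row has been dropped.

```xml
<!-- Sketch: verify the header row, then omit it from the validated output -->
<to uri="pfx-validator:csv?hasHeaderRecord=true&amp;header=sku,label,price&amp;skipHeaderRecord=true"/>
```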

CSV-specific

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| regularExpression | String | -- | Regex each data line must match |
| formatRef | String | -- | Bean ID of a CSVFormat bean in beans/ |
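
The formatRef option might be used along the following lines. This is a sketch under two assumptions that this page does not confirm: that the CSVFormat in question is Apache Commons CSV's org.apache.commons.csv.CSVFormat, and that files in beans/ are Spring bean definitions. The file name csv-formats.xml and the bean ID semicolonFormat are hypothetical.

```xml
<!-- beans/csv-formats.xml (hypothetical file name) -->
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">
    <!-- Assumption: Apache Commons CSV. The static factory CSVFormat.newFormat(char)
         returns a format that sets only the delimiter. -->
    <bean id="semicolonFormat" class="org.apache.commons.csv.CSVFormat" factory-method="newFormat">
        <constructor-arg value=";"/>
    </bean>
</beans>
```

The route then refers to the bean by its ID:

```xml
<to uri="pfx-validator:csv?formatRef=semicolonFormat&amp;header=sku,label,price"/>
```

Because newFormat sets only the delimiter, quoting and other format settings would need additional configuration if the files rely on them.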

XLSX-specific

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| sheetIndex | Integer | 0 | Sheet index to validate (0-based) |
| sheetName | String | -- | Sheet name to validate (overrides sheetIndex) |
| format | String | xlsx | Output format after validation |
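
Since sheetIndex is 0-based, selecting a sheet by position means subtracting one from its visible position in the workbook. A minimal sketch, with illustrative header names:

```xml
<!-- Sketch: validate the second sheet of the workbook (index 1) -->
<to uri="pfx-validator:xlsx?sheetIndex=1&amp;header=SKU,Name,Price"/>
```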


Examples

Validate CSV headers before import

XML
<routes xmlns="http://camel.apache.org/schema/spring">
    <route id="importValidatedProducts">
        <from uri="file:{{inbound.path}}?{{archive.file}}&amp;{{read.lock}}"/>
        <to uri="pfx-validator:csv?header=sku,label,price&amp;hasHeaderRecord=true"/>
        <split>
            <tokenize token="&#10;" group="5000"/>
            <to uri="pfx-csv:unmarshal?delimiter=,&amp;skipHeaderRecord=true"/>
            <to uri="pfx-api:loaddata?objectType=P&amp;mapper=productMapper"/>
        </split>
        <onCompletion onCompleteOnly="true">
            <to uri="pfx-api:internalCopy?label=Product"/>
        </onCompletion>
    </route>
</routes>

Validate XLSX by sheet name with warnings only

XML
<to uri="pfx-validator:xlsx?sheetName=Products&amp;header=SKU,Name,Price&amp;onlyPrintWarning=true"/>

Validate CSV with regex pattern

XML
<!-- Each line must have exactly 3 non-empty comma-separated fields -->
<to uri="pfx-validator:csv?regularExpression=^[^,]+,[^,]+,[^,]+$&amp;checkEmptyLines=true"/>

JSON Schema Validation (csvlint)

For field-level validation beyond header/format checks, define a JSON schema and reference it with validationSchemaName. The schema is processed by the integration-csvlint module built into IM.

Schema Format

Place the schema JSON file in the resources/ directory:

JSON
{
  "fields": [
    {
      "name": "sku",
      "constraints": {
        "required": true,
        "type": "POSITIVE_INTEGER",
        "minLength": 1,
        "maximum": 999999
      }
    },
    {
      "name": "price",
      "constraints": {
        "required": true,
        "type": "DOUBLE",
        "maximum": 100000
      }
    },
    {
      "name": "validFrom",
      "constraints": {
        "type": "DATE"
      }
    }
  ]
}

Usage

XML
<to uri="pfx-validator:csv?validationSchemaName=product-schema&amp;hasHeaderRecord=true"/>

Supported Constraints

| Constraint | Description | Example |
|------------|-------------|---------|
| required | Field must not be null or empty | "required": true |
| type | Data type validation | "type": "DOUBLE" |
| minLength | Minimum string length | "minLength": 3 |
| maxLength | Maximum string length | "maxLength": 100 |
| minimum | Minimum numeric value | "minimum": 0 |
| maximum | Maximum numeric value | "maximum": 999999 |
| pattern | Regex pattern match | "pattern": "[A-Z0-9]+" |

Supported Types

| Type | Description |
|------|-------------|
| POSITIVE_INTEGER | Positive whole number |
| INTEGER | Any integer (positive or negative) |
| DOUBLE | Floating-point number |
| FLOAT | Float-precision number |
| NUMBER | Generic number |
| BOOLEAN | Boolean value |
| DATE | Date value |
| URL | Valid URL format |

Validation Result

When validation fails and onlyPrintWarning=false (the default), a ValidationErrorException is thrown with the message:
"Validation finished with state ERROR. Errors: {count}, warnings: {count}"

Each error includes: field name, row number, column number, error type, description, and the actual value that failed.
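
One way a route could react to a failed validation is sketched below using Camel's standard onException clause. It catches java.lang.Exception because the fully qualified name of ValidationErrorException is not given on this page; narrowing the clause to that class would be preferable once the name is known. The route ID and the {{error.path}} property are hypothetical.

```xml
<route id="importWithRejectHandling">
    <from uri="file:{{inbound.path}}"/>
    <onException>
        <!-- Broad catch: replace with the validator's exception class when its package is known -->
        <exception>java.lang.Exception</exception>
        <handled><constant>true</constant></handled>
        <!-- Log the file name and the exception message -->
        <log loggingLevel="ERROR" message="Rejected ${file:name}: ${exception.message}"/>
        <!-- {{error.path}} is a hypothetical property pointing at a reject folder -->
        <to uri="file:{{error.path}}"/>
    </onException>
    <to uri="pfx-validator:csv?header=sku,label,price&amp;hasHeaderRecord=true"/>
    <!-- unmarshal and load steps as in the import example -->
</route>
```

Marking the exception as handled stops it from propagating to the caller, so the rejected file is copied to the reject folder and later files are still picked up.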


Common Pitfalls

  • If header is specified, the validator checks that the file's first row matches it exactly, so column order matters.

  • onlyPrintWarning=true lets the route continue even with invalid files. Use only when partial processing is acceptable.

  • validationSchemaName requires the schema JSON to be in the resources/ directory.

  • For XLSX, either sheetIndex or sheetName selects the sheet. If both are set, sheetName takes precedence.

  • The validator runs before pfx-csv:unmarshal, so it validates the raw file, not parsed records.