pfx-validator Component
Overview
The pfx-validator component validates CSV and XLSX files before they are processed by import routes. It checks structure, headers, line format, and optional schema rules -- failing fast before bad data reaches Pricefx.
URI pattern: pfx-validator:format[?options]
Methods
| Method | Description |
|---|---|
| `csv` | Validate a CSV file body |
| `xlsx` | Validate an XLSX file body |
Parameters
Common (CSV and XLSX)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `checkEmptyLines` | boolean | | Fail if any non-empty lines violate format |
| `onlyPrintWarning` | boolean | | Log validation errors as warnings instead of throwing an exception |
| | boolean | | Only count lines, skip content validation |
| `validationSchemaName` | String | -- | Schema reference for structural validation |
| `hasHeaderRecord` | Boolean | | Whether the input contains a header row |
| `header` | String | -- | Expected header column names (comma-separated) |
| | Boolean | | Skip the header row in output after validation |
CSV-specific
| Parameter | Type | Default | Description |
|---|---|---|---|
| `regularExpression` | String | -- | Regex each data line must match |
| | String | -- | Bean ID of a |
XLSX-specific
| Parameter | Type | Default | Description |
|---|---|---|---|
| `sheetIndex` | Integer | | Sheet index to validate (0-based) |
| `sheetName` | String | -- | Sheet name to validate (overrides `sheetIndex`) |
| | String | | Output format after validation |
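As a small sketch of selecting a sheet by position rather than by name, the endpoint below validates the first sheet of a workbook. It assumes `header` and `hasHeaderRecord` apply to XLSX input the same way they do to CSV; the column names are illustrative.

```xml
<!-- Validate the first sheet (index 0) regardless of its name.
     sheetName is left unset because it would override sheetIndex. -->
<to uri="pfx-validator:xlsx?sheetIndex=0&amp;header=SKU,Name,Price&amp;hasHeaderRecord=true"/>
```

Selecting by index may be the more robust choice when sheet names vary between uploads but the sheet order is fixed.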
Examples
Validate CSV headers before import
<routes xmlns="http://camel.apache.org/schema/spring">
    <route id="importValidatedProducts">
        <from uri="file:{{inbound.path}}?{{archive.file}}&amp;{{read.lock}}"/>
        <to uri="pfx-validator:csv?header=sku,label,price&amp;hasHeaderRecord=true"/>
        <split>
            <tokenize token="\n" group="5000"/>
            <to uri="pfx-csv:unmarshal?delimiter=,&amp;skipHeaderRecord=true"/>
            <to uri="pfx-api:loaddata?objectType=P&amp;mapper=productMapper"/>
        </split>
        <onCompletion onCompleteOnly="true">
            <to uri="pfx-api:internalCopy?label=Product"/>
        </onCompletion>
    </route>
</routes>
Validate XLSX by sheet name with warnings only
<to uri="pfx-validator:xlsx?sheetName=Products&amp;header=SKU,Name,Price&amp;onlyPrintWarning=true"/>
Validate CSV with regex pattern
<!-- Each line must have exactly 3 non-empty comma-separated fields -->
<to uri="pfx-validator:csv?regularExpression=^[^,]+,[^,]+,[^,]+$&amp;checkEmptyLines=true"/>
JSON Schema Validation (csvlint)
For field-level validation beyond header/format checks, define a JSON schema and reference it with validationSchemaName. The schema is processed by the integration-csvlint module built into IM.
Schema Format
Place the schema JSON file in the resources/ directory:
{
  "fields": [
    {
      "name": "sku",
      "constraints": {
        "required": true,
        "type": "POSITIVE_INTEGER",
        "minLength": 1,
        "maximum": 999999
      }
    },
    {
      "name": "price",
      "constraints": {
        "required": true,
        "type": "DOUBLE",
        "maximum": 100000
      }
    },
    {
      "name": "validFrom",
      "constraints": {
        "type": "DATE"
      }
    }
  ]
}
Usage
<to uri="pfx-validator:csv?validationSchemaName=product-schema&amp;hasHeaderRecord=true"/>
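To show where schema validation might sit in a complete route, the sketch below reuses the endpoints from the import example in this document and swaps the header check for the schema reference. The route id `importSchemaValidatedProducts` is hypothetical. It assumes `product-schema` resolves to the schema file placed in `resources/`; how the name maps to a file name is not specified here.

```xml
<routes xmlns="http://camel.apache.org/schema/spring">
    <!-- Hypothetical route id; endpoints and options are reused from the examples in this document -->
    <route id="importSchemaValidatedProducts">
        <from uri="file:{{inbound.path}}?{{archive.file}}&amp;{{read.lock}}"/>
        <!-- Field-level checks run on the raw file; with the default onlyPrintWarning setting
             a failure throws here, before any data is loaded -->
        <to uri="pfx-validator:csv?validationSchemaName=product-schema&amp;hasHeaderRecord=true"/>
        <to uri="pfx-csv:unmarshal?delimiter=,&amp;skipHeaderRecord=true"/>
        <to uri="pfx-api:loaddata?objectType=P&amp;mapper=productMapper"/>
    </route>
</routes>
```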
Supported Constraints
| Constraint | Description | Example |
|---|---|---|
| `required` | Field must not be null or empty | |
| `type` | Data type validation | |
| `minLength` | Minimum string length | |
| | Maximum string length | |
| | Minimum numeric value | |
| `maximum` | Maximum numeric value | |
| | Regex pattern match | |
Supported Types
| Type | Description |
|---|---|
| `POSITIVE_INTEGER` | Positive whole number |
| | Any integer (positive or negative) |
| `DOUBLE` | Floating-point number |
| | Float-precision number |
| | Generic number |
| | Boolean value |
| `DATE` | Date value |
| | Valid URL format |
Validation Result
When validation fails and `onlyPrintWarning=false` (the default), a `ValidationErrorException` is thrown with the message:
"Validation finished with state ERROR. Errors: {count}, warnings: {count}"
Each error includes: field name, row number, column number, error type, description, and the actual value that failed.
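One way a route might react to a failed validation is a route-scoped `onException` block that logs the message and moves the file aside. This is a sketch under assumptions: the route id `validateOrReject` and the `{{rejected.path}}` property are hypothetical, and `java.lang.Exception` is used as a catch-all because the package of `ValidationErrorException` is not given in this document.

```xml
<routes xmlns="http://camel.apache.org/schema/spring">
    <!-- Hypothetical route id -->
    <route id="validateOrReject">
        <from uri="file:{{inbound.path}}"/>
        <onException>
            <!-- Assumption: catch-all stand-in. Replace with the fully qualified class name of
                 ValidationErrorException so that load failures are not swallowed as well. -->
            <exception>java.lang.Exception</exception>
            <handled><constant>true</constant></handled>
            <log loggingLevel="ERROR" message="Rejected ${file:name}: ${exception.message}"/>
            <!-- Hypothetical property: directory for files that failed validation -->
            <to uri="file:{{rejected.path}}"/>
        </onException>
        <to uri="pfx-validator:csv?header=sku,label,price&amp;hasHeaderRecord=true"/>
        <to uri="pfx-csv:unmarshal?delimiter=,&amp;skipHeaderRecord=true"/>
        <to uri="pfx-api:loaddata?objectType=P&amp;mapper=productMapper"/>
    </route>
</routes>
```

Marking the exception as handled keeps the failure from propagating back to the file consumer, so the rejected file is written once to the reject directory instead of being retried.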
Common Pitfalls
- If `header` is specified, the validator checks that the file's first row exactly matches -- column order matters.
- `onlyPrintWarning=true` lets the route continue even with invalid files. Use only when partial processing is acceptable.
- `validationSchemaName` requires the schema JSON to be in the `resources/` directory.
- For XLSX, either `sheetIndex` or `sheetName` selects the sheet. If both are set, `sheetName` takes precedence.
- The validator runs before `pfx-csv:unmarshal` -- it validates the raw file, not parsed records.