pfx-gp Component

Overview

The pfx-gp component loads bulk data into a Greenplum database via SSH/SFTP. It uploads the CSV body to the Greenplum master host and uses gpload to perform MERGE, UPDATE, or INSERT operations. Use it for high-throughput database synchronisation from Pricefx to Greenplum.

URI pattern: pfx-gp:loaddata[?options]


Methods

Method      Description
--------    --------------------------------------------------------
loaddata    Bulk-load the CSV body into a Greenplum table via gpload


Parameters

Connection (required)

Parameter      Type      Default   Description
------------   -------   -------   --------------------------------------------------
masterHost     String              Greenplum master host IP or hostname
masterPort     Integer             Greenplum master port
database       String              Target database name
dbUser         String              Database user
dbPassword     String              Database password; use a {{property}} placeholder
sshHost        String              SSH/SFTP host for file upload
sshPort        int       22        SSH port
sshUsername    String              SSH username
sshPassword    String              SSH password; use a {{property}} placeholder
sshDirectory   String    /         Directory on SSH host for staging files

Load target

Parameter         Type      Default   Description
---------------   -------   -------   --------------------------------------------------
table             String              Required. Target Greenplum table name
mode              String    MERGE     Load mode: MERGE, UPDATE, or INSERT
matchColumns      List                Columns used to match existing rows (MERGE/UPDATE)
updateColumns     List                Columns to update on match
updateCondition   String              Custom SQL condition for update
truncate          boolean   false     Truncate table before loading
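
As an illustration of the load-target options, the fragment below sketches an UPDATE-mode load that matches existing rows on one column and overwrites two others. The direct:updatePrices endpoint, the route id, the price_overrides table, and the column names are hypothetical. The comma-separated syntax for updateColumns is an assumption made by analogy with how matchColumns is written in the Example section, and the {{gp.*}} placeholders are the ones used there. The route assumes the message body already contains CSV.

```xml
<routes xmlns="http://camel.apache.org/schema/spring">
    <!-- Hypothetical route: expects the incoming body to already be CSV -->
    <route id="updatePricesInGreenplum">
        <from uri="direct:updatePrices"/>
        <!-- UPDATE rows whose sku matches; only the listed columns are overwritten -->
        <to uri="pfx-gp:loaddata
            ?masterHost={{gp.host}}
            &amp;masterPort={{gp.port}}
            &amp;database={{gp.database}}
            &amp;dbUser={{gp.user}}
            &amp;dbPassword={{gp.password}}
            &amp;sshHost={{gp.ssh.host}}
            &amp;sshUsername={{gp.ssh.user}}
            &amp;sshPassword={{gp.ssh.password}}
            &amp;table=price_overrides
            &amp;mode=UPDATE
            &amp;matchColumns=sku
            &amp;updateColumns=price,valid_from"/>
    </route>
</routes>
```

updateCondition is left out here because the source only describes it as a custom SQL condition; its exact syntax and any URI encoding it needs should be checked before use.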

Format

Parameter      Type      Default   Description
------------   -------   -------   ------------------------------------------
format         String    CSV       Input format (CSV, TEXT)
delimiter      String    ,         Field delimiter
escape         String    \\        Escape character
quote          String    "         Quote character
header         boolean   false     Input includes a header row
forceNotNull   List                Columns that must not be null
errorLimit     int       0         Number of tolerated errors before aborting
logErrors      boolean   false     Log rejected rows instead of aborting
portFrom       int       8000      Start of gpload port range
portTo         int       9000      End of gpload port range
reuseTables    boolean   false     Reuse gpload staging tables across runs
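
The format and error-handling options can be combined as in the following sketch, which loads semicolon-delimited CSV with a header row and tolerates a limited number of bad rows. The direct:loadSemicolonCsv endpoint, the route id, the staging_prices table, and the errorLimit value of 50 are illustrative assumptions, not recommended settings. Whether special delimiter characters need URL-encoding in the endpoint URI is not stated in the source, so a plain semicolon is used here.

```xml
<routes xmlns="http://camel.apache.org/schema/spring">
    <!-- Hypothetical route: body is semicolon-delimited CSV with a header row -->
    <route id="loadSemicolonCsv">
        <from uri="direct:loadSemicolonCsv"/>
        <!-- INSERT mode, so matchColumns is not needed -->
        <to uri="pfx-gp:loaddata
            ?masterHost={{gp.host}}
            &amp;masterPort={{gp.port}}
            &amp;database={{gp.database}}
            &amp;dbUser={{gp.user}}
            &amp;dbPassword={{gp.password}}
            &amp;sshHost={{gp.ssh.host}}
            &amp;sshUsername={{gp.ssh.user}}
            &amp;sshPassword={{gp.ssh.password}}
            &amp;table=staging_prices
            &amp;mode=INSERT
            &amp;format=CSV
            &amp;delimiter=;
            &amp;header=true
            &amp;errorLimit=50
            &amp;logErrors=true"/>
    </route>
</routes>
```

With the defaults (errorLimit=0, logErrors=false) no errors are tolerated, so raising errorLimit is a deliberate trade-off between load completeness and resilience to malformed rows.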


Example

MERGE from Pricefx export to Greenplum

XML
<routes xmlns="http://camel.apache.org/schema/spring">
    <route id="exportToGreenplum">
        <!-- Fire once when the route starts -->
        <from uri="timer://sync?repeatCount=1"/>
        <!-- Fetch objects of type PX using the pricelistFilter filter -->
        <to uri="pfx-api:fetch?objectType=PX&amp;filter=pricelistFilter"/>
        <!-- Marshal the fetched records to comma-delimited CSV -->
        <to uri="pfx-csv:marshal?delimiter=,"/>
        <!-- Upload the CSV over SSH/SFTP and MERGE it into pricelist_items -->
        <to uri="pfx-gp:loaddata
            ?masterHost={{gp.host}}
            &amp;masterPort={{gp.port}}
            &amp;database={{gp.database}}
            &amp;dbUser={{gp.user}}
            &amp;dbPassword={{gp.password}}
            &amp;sshHost={{gp.ssh.host}}
            &amp;sshUsername={{gp.ssh.user}}
            &amp;sshPassword={{gp.ssh.password}}
            &amp;table=pricelist_items
            &amp;mode=MERGE
            &amp;matchColumns=sku,pricelist_id
            &amp;header=true"/>
    </route>
</routes>
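
A full-refresh variant of the route above might truncate the target table and reload it with plain inserts instead of merging. This is a sketch under assumptions: the direct:refreshPricelist endpoint and the route id are hypothetical, and the body is assumed to be CSV with a header row, produced the same way as in the MERGE example.

```xml
<routes xmlns="http://camel.apache.org/schema/spring">
    <!-- Hypothetical route: replaces the table contents on every run -->
    <route id="refreshPricelistInGreenplum">
        <from uri="direct:refreshPricelist"/>
        <!-- truncate=true empties pricelist_items before the INSERT load -->
        <to uri="pfx-gp:loaddata
            ?masterHost={{gp.host}}
            &amp;masterPort={{gp.port}}
            &amp;database={{gp.database}}
            &amp;dbUser={{gp.user}}
            &amp;dbPassword={{gp.password}}
            &amp;sshHost={{gp.ssh.host}}
            &amp;sshUsername={{gp.ssh.user}}
            &amp;sshPassword={{gp.ssh.password}}
            &amp;table=pricelist_items
            &amp;mode=INSERT
            &amp;truncate=true
            &amp;header=true"/>
    </route>
</routes>
```

Because every row is inserted after the truncate, matchColumns is not set in this variant.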

Common Pitfalls

  • Never hardcode credentials — always use {{property}} placeholders for dbPassword and sshPassword.

  • matchColumns is required for MERGE and UPDATE modes. If it is omitted, all rows are treated as inserts.

  • Greenplum gpload requires network access from the IntegrationManager (IM) instance to both masterHost (database port, masterPort) and sshHost (SSH port, sshPort). Verify firewall rules before deployment.

  • portFrom/portTo define the port range used by gpload for parallel loading — ensure this range is open between Greenplum segments and the IM instance.

  • Set header=true if your marshalled CSV includes a header row so gpload skips it during load.
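
To make the network-related pitfalls concrete, the sketch below sets the SSH port, staging directory, and gpload port range explicitly so that they can be lined up with firewall rules. The direct:loadWithPinnedPorts endpoint, the route id, the /tmp/gpload-staging directory, and the 8100 to 8110 range are hypothetical values chosen for illustration; the actual range must match what is open in the target environment.

```xml
<routes xmlns="http://camel.apache.org/schema/spring">
    <!-- Hypothetical route: connection and port settings spelled out explicitly -->
    <route id="loadWithPinnedPorts">
        <from uri="direct:loadWithPinnedPorts"/>
        <!-- A narrow portFrom/portTo range means fewer ports to open in the firewall -->
        <to uri="pfx-gp:loaddata
            ?masterHost={{gp.host}}
            &amp;masterPort={{gp.port}}
            &amp;database={{gp.database}}
            &amp;dbUser={{gp.user}}
            &amp;dbPassword={{gp.password}}
            &amp;sshHost={{gp.ssh.host}}
            &amp;sshPort=22
            &amp;sshUsername={{gp.ssh.user}}
            &amp;sshPassword={{gp.ssh.password}}
            &amp;sshDirectory=/tmp/gpload-staging
            &amp;portFrom=8100
            &amp;portTo=8110
            &amp;table=pricelist_items
            &amp;mode=MERGE
            &amp;matchColumns=sku,pricelist_id
            &amp;header=true"/>
    </route>
</routes>
```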