Data Requirements (Optimization - Product Similarity)

When using the Product Similarity Accelerator, it is essential to understand the data required to enable the module’s functionality and produce accurate, meaningful similarity results.

Data can be loaded from either a Data Source or a Datamart.

The Product Master, Product Extensions, and Company Parameter tables are not supported as inputs; load all product information into a Data Source or Datamart instead.

Product Data Scope

The fields marked as required below are necessary for the basic functioning of the system. Without them, Product Similarity cannot perform its core operations. The remaining fields are optional.

| Field | Required? | Comment |
| --- | --- | --- |
| Product ID | Yes | Unique product identifier (e.g., ProductID). It ensures that each product is treated as a distinct entity. |
| Product Name (Text/Descriptive) | Yes | Descriptive field used for text-based similarity computation (e.g., ProductName). Limited to 255 characters. |
| Other Text/Descriptive Attributes | No | Additional free-text product descriptors or short descriptions. They improve text-based model accuracy. |
| Categorical Attributes | At least one required | Attributes that represent discrete product groupings (e.g., Brand, Industry, Product Group, Sub Product Group, Packaging Type, Product Type). Used for clustering and filtering. Hierarchical descriptors and product categories can provide valuable context for grouping products. |
| Numerical Attributes to Sum | No | Numeric measures that are summed when aggregated at the product level (e.g., BaseQty). Used to weight similarity if applicable. |
| Numerical Attributes to Average | No | Continuous numeric attributes that are averaged when aggregated at the product level (e.g., MediumPrice), such as size, power, or unit price. Used for threshold calculations or normalization. |
| Add Price Delta Threshold | No | Activates comparison based on the Percent Delta Threshold defined below. |
| Price Attribute for Threshold | Required when Add Price Delta Threshold is enabled | Select a numeric, product-level price measure (for example, MediumPrice or another consistent price attribute). Use a field with a single currency and unit across the dataset. |
| Percent Delta Threshold | No | Percentage deviation (e.g., 20%) used to limit similarity to comparable price ranges. |
| Data Source Input | Yes | Defines the Data Source or Datamart used for the product data. Example: Product [DS]. |
| Data Source Filter | No | Recommended to restrict the scope (e.g., ProductID starts with “P-”). Helps exclude archived or test records. |
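The effect of the Percent Delta Threshold can be illustrated with a short Python sketch. The exact formula used by the accelerator is not stated here, so this example assumes that the deviation is measured relative to the price of the reference product and that the boundary value is included. The function names and sample prices are hypothetical.

```python
def within_price_delta(reference_price, candidate_price, percent_delta):
    """Return True if candidate_price deviates from reference_price
    by at most percent_delta percent (assumed definition)."""
    if reference_price <= 0:
        raise ValueError("reference_price must be positive")
    # Cross-multiplied form of
    #   |candidate - reference| / reference * 100 <= percent_delta
    # which avoids the division and its rounding error.
    return abs(candidate_price - reference_price) * 100 <= percent_delta * reference_price


def comparable_products(prices, product_id, percent_delta):
    """List the other products whose price lies inside the threshold band."""
    reference = prices[product_id]
    return sorted(
        other_id
        for other_id, price in prices.items()
        if other_id != product_id
        and within_price_delta(reference, price, percent_delta)
    )


# Hypothetical MediumPrice values, all in one currency and unit.
prices = {"P-001": 100.0, "P-002": 115.0, "P-003": 120.0, "P-004": 150.0}
print(comparable_products(prices, "P-001", 20))  # ['P-002', 'P-003']
```

With a 20% threshold and a reference price of 100, only products priced between 80 and 120 remain candidates. This is also why the price attribute should use a single currency and unit: mixed units would make the percentage comparison meaningless.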

Transaction Data Scope

| Field | Required? | Comment |
| --- | --- | --- |
| Use Transaction Source Data (checkbox) | No | If selected, the fields below that are marked Yes become required. |
| Data Source Input (Transactional) | Yes | Source of transactional data. |
| Data Source Filter (Transactional) | No | Filter that defines a time range or data subset (e.g., “DateYear = 2024”). Use it to limit the analysis to a specific time frame, such as the last two years. |
| Product ID (Transactional) | Yes | Must match the ProductID from the product data, so both tables can be joined. |
| Text Attributes (Transactional) | No | Transaction-level text fields such as product names, descriptions, or notes that provide more context for each transaction. Limited to 255 characters. |
| Categorical Attributes (Transactional) | No | Fields such as OrderType or SalesChannel; useful for behavior-based similarity. As with product data, these can be hierarchical descriptors or transaction categories. |
| Numerical Attributes to Sum / Avg (Transactional) | No | Aggregated measures that provide context for each transaction (e.g., Quantity, Revenue, transaction amount). They help weight similarities based on activity volume or sales. |
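The following Python sketch shows one way to picture how transactional data relates to product data: transaction rows are aggregated to one record per Product ID (summing some measures, averaging others) and then joined to the product rows. It is an illustration under assumptions, not the accelerator's implementation. The function names, the UnitPrice field, and the sample rows are hypothetical, and the average is a simple unweighted mean.

```python
from collections import defaultdict


def aggregate_transactions(transactions, id_field, sum_fields, avg_fields):
    """Aggregate transaction rows to one record per product ID."""
    fields = set(sum_fields) | set(avg_fields)
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in transactions:
        bucket = totals[row[id_field]]
        counts[row[id_field]] += 1
        for field in fields:
            bucket[field] += row[field]
    result = {}
    for pid, sums in totals.items():
        record = {field: sums[field] for field in sum_fields}
        # Unweighted mean over the product's transaction rows.
        record.update({field: sums[field] / counts[pid] for field in avg_fields})
        result[pid] = record
    return result


def join_to_products(products, aggregated, id_field):
    """Attach aggregates to products and report transaction IDs
    that have no matching product row."""
    product_ids = {product[id_field] for product in products}
    unmatched = sorted(set(aggregated) - product_ids)
    joined = [{**p, **aggregated.get(p[id_field], {})} for p in products]
    return joined, unmatched


products = [
    {"ProductID": "P-001", "Brand": "Acme"},
    {"ProductID": "P-002", "Brand": "Bolt"},
]
transactions = [
    {"ProductID": "P-001", "Quantity": 10, "UnitPrice": 4.0},
    {"ProductID": "P-001", "Quantity": 30, "UnitPrice": 6.0},
    {"ProductID": "P-999", "Quantity": 5, "UnitPrice": 9.0},
]

aggregated = aggregate_transactions(
    transactions, "ProductID", sum_fields=["Quantity"], avg_fields=["UnitPrice"]
)
joined, unmatched = join_to_products(products, aggregated, "ProductID")
print(joined[0])   # P-001 with Quantity 40.0 and UnitPrice 5.0
print(unmatched)   # ['P-999']
```

Transaction rows whose Product ID has no counterpart in the product data (P-999 here) cannot be joined, which is why the two ID fields must match.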

Optional fields, while not strictly mandatory, significantly enhance the system's outputs and offer richer insights for defining product similarities. Most of the business value of the Product Similarity Accelerator comes from choosing the right product attributes for your business. These attributes vary widely by industry, so take a moment to list what makes a product distinctive in your own business. Here are some examples:

| Attribute | Comment |
| --- | --- |
| Brand | Manufacturer or brand name associated with the product. |
| Packaging | Product packaging or unit type (e.g., bottle, box, bag, etc.). |
| Size / Dimensions | Physical size or dimensions (height, width, depth, volume). |
| Weight | Product weight; used in comparison for physical goods. |
| Color | Relevant for visually distinctive items or retail products. |
| Material | Key for durable goods, textiles, or manufactured products. |
| Specification Features | Technical attributes (e.g., storage capacity, speed, resolution). |
| Power Source | Electrical or mechanical energy source details. |
| Certifications / Labels | Standards or quality certifications relevant to the product. |
| Country of Origin | Country or region where the product is manufactured. |
| Lifecycle Stage | Indicates whether a product is Active, New, Discontinued, etc. |

By fulfilling the mandatory data requirements and supplementing them with the optional fields, you can maximize the value of the Product Similarity Accelerator. Provide as many relevant fields as possible to obtain nuanced, accurate, and comprehensive results.
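Before loading data, the mandatory requirements from the Product Data Scope table can be checked with a small script. The Python sketch below is a hypothetical pre-load check, not part of the accelerator. It verifies that Product IDs are present and unique, that the product name is filled in and within the 255-character limit, and that at least one categorical attribute has been selected. The function name and sample rows are assumptions.

```python
from collections import Counter

MAX_TEXT_LENGTH = 255  # character limit stated for text attributes


def validate_products(rows, id_field, name_field, categorical_fields):
    """Return a list of problems found in the product rows."""
    problems = []
    if not categorical_fields:
        problems.append("select at least one categorical attribute")

    # Product IDs must be present and unique.
    id_counts = Counter(row.get(id_field) for row in rows)
    for pid, count in id_counts.items():
        if pid in (None, ""):
            problems.append(f"{count} row(s) without {id_field}")
        elif count > 1:
            problems.append(f"{id_field} {pid} is not unique ({count} rows)")

    # The product name is required and limited in length.
    for row in rows:
        name = row.get(name_field)
        if not name:
            problems.append(f"{row.get(id_field)}: {name_field} is empty")
        elif len(name) > MAX_TEXT_LENGTH:
            problems.append(
                f"{row.get(id_field)}: {name_field} exceeds {MAX_TEXT_LENGTH} characters"
            )
    return problems


rows = [
    {"ProductID": "P-001", "ProductName": "Sparkling water 0.5 l", "Brand": "Acme"},
    {"ProductID": "P-001", "ProductName": "Sparkling water 1.5 l", "Brand": "Acme"},
    {"ProductID": "P-002", "ProductName": "", "Brand": "Bolt"},
    {"ProductID": "P-003", "ProductName": "x" * 300, "Brand": "Bolt"},
]
problems = validate_products(rows, "ProductID", "ProductName", ["Brand"])
for problem in problems:
    print(problem)
```

For the sample rows, the check reports the duplicated P-001, the empty name of P-002, and the overlong name of P-003.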

Language Support

For text attributes, only the following languages are supported: Arabic, Chinese, Dutch, English, French, German, Italian, Korean, Polish, Portuguese, Russian, Spanish, Turkish.