Features of Legal Entities

This template enhances your data with standardized values and consolidates similar records into grouped entities.

About the Data Quality Services

The legal entities template includes data quality services for these attributes:

The data quality process examines values for the template's attributes, and adds any resulting validated, standardized values to each record in new enrichment-specific attributes. The original values mapped from your source datasets remain present and unchanged. See the topics linked above for processing details and added attributes.
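As a rough illustration of this behavior, the Python sketch below writes a standardized value to a new attribute and leaves the mapped source value untouched. The `standardize_country` helper, its lookup table, and the `country_standardized` attribute name are hypothetical stand-ins chosen for this example. They are not the template's actual services or output attributes.

```python
from typing import Optional

# Hypothetical lookup used only for this sketch.
COUNTRY_LOOKUP = {
    "us": "US",
    "usa": "US",
    "united states": "US",
    "ie": "IE",
    "ireland": "IE",
}


def standardize_country(value: Optional[str]) -> Optional[str]:
    """Return a standardized code, or None when the value cannot be validated."""
    if not value:
        return None
    return COUNTRY_LOOKUP.get(value.strip().lower())


def enrich(record: dict) -> dict:
    """Return a copy of the record with one added, enrichment-specific attribute."""
    enriched = dict(record)  # original mapped values are copied unchanged
    enriched["country_standardized"] = standardize_country(record.get("country"))
    return enriched


source = {"company_name": "AMERICOLD", "country": "United States"}
result = enrich(source)
```

In this sketch, `result` carries both the original `country` value and the added `country_standardized` value, and `source` itself is not modified.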

About the Clustering Model

The legal entities model groups company records as follows:

First, by trusted_id. Records with the same trusted_id are always clustered together. Records with different trusted_ids are never clustered together.

Records with a null or empty trusted_id are clustered based on similarity, which means that they may be clustered with records that have a trusted_id.

Then, by similarity. Records with null or empty trusted_ids are clustered based on similarities between values for these attributes:

  • Company name and alternative company names
  • Legal entity name and alternative legal entity names, provided by the Company enricher
  • Full address
  • Country
  • Website

Note: Generic descriptions, rather than specific attribute names, are listed to represent both the standard schema and the attributes added by the enrichers and other data transformations.
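The two rules above can be sketched as a pairwise decision in Python. This is a minimal sketch under stated assumptions, not the model itself: the `should_cluster` function, the 0.85 threshold, and the use of `difflib.SequenceMatcher` on company name alone are hypothetical stand-ins for the model's comparison across all of the listed attributes.

```python
from difflib import SequenceMatcher


def has_trusted_id(record: dict) -> bool:
    """A trusted_id counts as present only if it is neither null nor empty."""
    value = record.get("trusted_id")
    return value is not None and str(value).strip() != ""


def name_similarity(a: dict, b: dict) -> float:
    """Stand-in similarity score based only on the company names."""
    left = (a.get("company_name") or "").lower()
    right = (b.get("company_name") or "").lower()
    if not left or not right:
        return 0.0  # a missing name gives no evidence of similarity
    return SequenceMatcher(None, left, right).ratio()


def should_cluster(a: dict, b: dict, threshold: float = 0.85) -> bool:
    # Rule 1: when both records carry a trusted_id, it alone decides.
    if has_trusted_id(a) and has_trusted_id(b):
        return str(a["trusted_id"]) == str(b["trusted_id"])
    # Rule 2: at least one trusted_id is missing, so fall back to similarity.
    return name_similarity(a, b) >= threshold


same_id = should_cluster(
    {"company_name": "LEGAL & GENERAL UCITS ETF", "trusted_id": "3"},
    {"company_name": "MALLINCKRODT", "trusted_id": "3"},
)
different_id = should_cluster(
    {"company_name": "AMERICOLD", "trusted_id": "1"},
    {"company_name": "AMERICOLD", "trusted_id": "2"},
)
no_id = should_cluster(
    {"company_name": "G-W MANAGEMENT SERVICES, LLC", "trusted_id": ""},
    {"company_name": "G-W MANAGEMENT SERVICES, LLC", "trusted_id": None},
)
```

Here `same_id` is true even though the names differ, `different_id` is false even though the names match, and `no_id` falls through to the similarity comparison. The sketch only decides pairs. It does not show how the model resolves a record without a trusted_id that resembles records with different trusted_ids.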

Clustering Examples for Legal Entities

Examples of Source Records Clustered Together

Example 1: These records are clustered together because they have the same or highly similar values for company name and some address fields. Their street addresses differ, which indicates that they most likely represent different sites of the same legal entity.

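| Column            | Record 1 Value                | Record 2 Value                |
|-------------------|-------------------------------|-------------------------------|
| company_name      | G-W MANAGEMENT SERVICES, LLC  | G-W MANAGEMENT SERVICES, LLC  |
| address_line_1    | 11600 NEBEL ST STE 202        | 5010 NICHOLSON LN STE 200     |
| address_line_2    |                               |                               |
| alternative_names |                               |                               |
| city              | ROCKVILLE                     |                               |
| country           | USA: UNITED STATES OF AMERICA | USA: UNITED STATES OF AMERICA |
| phone             |                               |                               |
| postal_code       | 20852                         | 20852                         |
| region            | MD                            | MD                            |
| url               |                               |                               |
| trusted_id        |                               |                               |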

Example 2: These records are clustered together because they have the same trusted_id.

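| Column            | Record 1 Value            | Record 2 Value                       |
|-------------------|---------------------------|--------------------------------------|
| company_name      | LEGAL & GENERAL UCITS ETF | MALLINCKRODT                         |
| address_line_1    | 2 GRAND CANAL SQUARE      | CRUISERATH, BLANCHARDSTOWN DUBLIN 15 |
| address_line_2    |                           |                                      |
| alternative_names |                           |                                      |
| city              | DUBLIN                    | DUBLIN                               |
| country           | IE                        | IE                                   |
| phone             |                           |                                      |
| postal_code       |                           | D15 TX2V                             |
| region            |                           |                                      |
| url               |                           |                                      |
| trusted_id        | 3                         | 3                                    |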

Examples of Source Records Not Clustered Together

Example 1: These records are not clustered together because of missing or dissimilar address and website information, and only moderately similar company names, indicating that they most likely represent different legal entities.

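| Column            | Record 1 Value     | Record 2 Value                   |
|-------------------|--------------------|----------------------------------|
| company_name      | BALFOUR BEATTY LLC | Balfour Beatty Construction, LLC |
| address_line_1    | SUITE 322          | 3100 Mckinnon St Fl 10           |
| address_line_2    |                    |                                  |
| alternative_names |                    |                                  |
| city              | WILMINGTON         | Dallas                           |
| country           | US                 | United States                    |
| phone             |                    | 214-451-1000                     |
| postal_code       | 19805              | 75201-7007                       |
| region            |                    | Texas                            |
| url               |                    | https://www.balfourbeatty.com/   |
| trusted_id        |                    |                                  |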

Example 2: These records are not clustered together because they have different values for trusted_id.

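| Column            | Record 1 Value            | Record 2 Value            |
|-------------------|---------------------------|---------------------------|
| company_name      | AMERICOLD                 | AMERICOLD                 |
| address_line_1    | 10 Glenlake Pkwy Ste. 600 | 10 Glenlake Pkwy Ste. 600 |
| address_line_2    |                           |                           |
| alternative_names |                           |                           |
| city              | Atlanta                   | Atlanta                   |
| country           | US                        | US                        |
| phone             |                           |                           |
| postal_code       | 30328                     | 30328                     |
| region            | Georgia                   | Georgia                   |
| url               |                           |                           |
| trusted_id        | 1                         | 2                         |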