Salesforce Data Skew

Skew means lopsidedness, and data skew refers to an uneven distribution of data within a data set. Salesforce uses the term "data skew" for data that is unevenly distributed across records: more than 10,000 child records are linked to the same parent record, or more than 10,000 records are owned by a single user.

Types of Data Skew :

  1. Account Data Skew.
  2. Lookup Data Skew.
  3. Ownership Data Skew.

Account Data Skew :

When too many child records are linked to the same parent account record, account data skew results.

For example :  When more than 10,000 contacts are linked to the same account, account data skew occurs.
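The 10,000-record threshold can be checked offline against an export of child records. The following is a minimal Python sketch, assuming the export is a list of dictionaries keyed by a parent field such as `AccountId`. The function name `find_skewed_parents` and the sample data are illustrative only and are not part of any Salesforce API.

```python
from collections import Counter

SKEW_THRESHOLD = 10_000  # threshold quoted in this article


def find_skewed_parents(child_records, parent_field="AccountId", threshold=SKEW_THRESHOLD):
    """Return {parent_id: child_count} for parents with more than `threshold` children."""
    counts = Counter(
        record[parent_field]
        for record in child_records
        if record.get(parent_field)  # ignore children that have no parent set
    )
    return {parent_id: n for parent_id, n in counts.items() if n > threshold}


# Illustrative data: one account with 10,001 contacts and one with 5.
contacts = [{"AccountId": "ACC-1"}] * 10_001 + [{"AccountId": "ACC-2"}] * 5
print(find_skewed_parents(contacts))  # {'ACC-1': 10001}
```

Because the parent field is a parameter, the same check could be pointed at any other parent reference in an export.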

Impact of Account Data Skew :

  • Record Locking Issue :
    • When a Contact or other associated "child" record is being modified, a temporary hold or "lock" is applied to the parent record. This guarantees data integrity and makes it possible for Salesforce to execute background updates like running validation rules and automation. Normally, you wouldn't even notice this, but if you are updating a lot of records that have the same parent, this temporary lock may prevent updates from succeeding.
  • Sharing Issue :
    • Suppose your sharing model makes contact access private and relies on many sharing rules to ensure that sales users see only the right contacts. When many Contacts (more than 10,000) are associated with the same Account, a change to the sharing rules triggers a recalculation across all of them, which can degrade performance.
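The record-locking problem described above is worst when several update batches touch children of the same parent at once. One commonly suggested mitigation, shown here only as a hedged sketch, is to order the records by parent before splitting them into batches so that siblings travel together. The function name, the `AccountId` key, and the batch size of 200 are assumptions made for illustration.

```python
from itertools import islice


def batches_grouped_by_parent(child_records, batch_size=200, parent_field="AccountId"):
    """Yield batches of records sorted by parent so that siblings stay adjacent."""
    # Sorting is stable, so records of the same parent keep their relative order.
    ordered = sorted(child_records, key=lambda r: r.get(parent_field) or "")
    iterator = iter(ordered)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Illustrative data: children of two parents arriving interleaved.
records = [
    {"Id": 1, "AccountId": "B"},
    {"Id": 2, "AccountId": "A"},
    {"Id": 3, "AccountId": "B"},
    {"Id": 4, "AccountId": "A"},
]
grouped = list(batches_grouped_by_parent(records, batch_size=2))
print([[r["Id"] for r in batch] for batch in grouped])  # [[2, 4], [1, 3]]
```

Grouping does not remove the skew itself: a parent with more children than one batch holds still spans several batches, so those batches would need to run one after another rather than in parallel.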

Considerations to Avoid Account Data Skew :

  • The only way to prevent account skew is to distribute child records across multiple accounts rather than collecting them under a single record.
  • Keeping child records evenly balanced across parent accounts protects the org from the performance degradation that account skew causes.
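The idea of spreading child records evenly can be sketched as a simple round-robin assignment. This is a minimal illustration, assuming that the child records can legitimately belong to any of several parent accounts and that child IDs are unique. The function name `spread_children` and all IDs are hypothetical.

```python
from collections import Counter


def spread_children(child_ids, parent_ids, max_per_parent=10_000):
    """Assign children to parents round-robin so no parent exceeds max_per_parent."""
    capacity = len(parent_ids) * max_per_parent
    if len(child_ids) > capacity:
        raise ValueError(
            f"{len(child_ids)} children need more than {len(parent_ids)} parents "
            f"at {max_per_parent} children per parent"
        )
    # Round-robin keeps the per-parent counts within one record of each other.
    return {
        child_id: parent_ids[index % len(parent_ids)]
        for index, child_id in enumerate(child_ids)
    }


# Illustrative data: 25,000 contacts spread over three accounts.
children = [f"CONTACT-{i}" for i in range(25_000)]
parents = ["ACC-A", "ACC-B", "ACC-C"]
assignment = spread_children(children, parents)
per_parent = Counter(assignment.values())
print(per_parent)
```

In practice the split would usually follow a business attribute such as region or product line rather than a blind round-robin. The sketch only shows that the cap holds when there are enough parents.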

Lookup Data Skew :

Lookup skew is similar to account skew, but it can affect more objects. It occurs when a large number of records are linked to a single parent record through a lookup field.

Example :

  • Consider a Salesforce org that handles customer issues for a mobile network. It has several departments, such as Postpaid, Prepaid, Internet Connections, and Recharge. Whenever an issue is raised, a case is created against the relevant department (i.e., Account) in Salesforce.
  • What would occur if there were to be more than 10,000 case records associated with a single Account Record?
  • They would begin to experience some of the signs of lookup data skew, such as records being temporarily "locked" during each update, which might have an impact on system performance or result in the failure of significant update operations.

Considerations to Avoid Lookup Data Skew :

  • Improve the performance of synchronous Apex, avoid workflow automation where possible, and make effective use of asynchronous Apex.
  • Reduce the number of child records associated with a single parent record by creating an additional lookup field.
  • Use a picklist field instead of a lookup when the number of lookup values is small. This reduces lookup skew.

Ownership Data Skew :

Another type of data skew, known as ownership data skew, is very common in Salesforce. It arises when a single Salesforce user owns more than 10,000 records.

Every record in Salesforce needs to have an owner, so it's typical practice in businesses to create a default owner or queue where all unassigned or unneeded records are sent.

Example :

  • Suppose a user owns more than 10,000 case records. If that user’s role changes in the role hierarchy, sharing must be recalculated for every record the user owns, which results in long-running operations.

The same sharing recalculation, and the performance issues that come with it, occurs whenever the owner of the records or the owner’s role changes.

Considerations to Avoid Ownership Data Skew :

  • Make sure that no user owns more than 10,000 records.
  • Make sure the skewed user (record owner) doesn’t have a role.
  • If the user must have a role in order to share data, place them in a separate role at the top of the hierarchy and do not move them out of that top-level role.
  • Keep them out of public groups that could be used as the source for sharing rules.
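The first two checks in the list above can be sketched against exported data. This is a minimal Python illustration, assuming one list of owner IDs (one entry per record) and a separate mapping from user ID to role name, with `None` meaning no role. The function name and the sample users are hypothetical.

```python
from collections import Counter


def skewed_owners_with_roles(record_owner_ids, user_roles, threshold=10_000):
    """Return {owner_id: (record_count, role)} for skewed owners who still have a role."""
    counts = Counter(record_owner_ids)
    return {
        owner: (count, user_roles[owner])
        for owner, count in counts.items()
        if count > threshold and user_roles.get(owner)  # skewed and role assigned
    }


# Illustrative data: U1 and U2 both own 10,001 records, but only U1 has a role.
owners = ["U1"] * 10_001 + ["U2"] * 10_001 + ["U3"] * 3
roles = {"U1": "Support Manager", "U2": None, "U3": "Agent"}
print(skewed_owners_with_roles(owners, roles))  # {'U1': (10001, 'Support Manager')}
```

Owners returned by this check are the ones for whom a role change would trigger the long sharing recalculation described earlier.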