Process billions of records with Async SOQL

Christmas time is closer but I would like to deliver this last post before end of the year.

Some months ago, I wrote an entry talking about BigObjects, a feature that was General Available by Winter ’18 and via a use case, I tried to explain it. Now, I would like to follow that post with a new way to create records, Async SOQL.

Async SOQL is half GA by Winter ’18 and half still in Pilot 

The use case talked about moving Code Review custom object records, into Code Review History big objects records, and release storage of custom objects one. We want to Archive records, and Async SOQL will help us to deal with big amounts of data.

What is Async SOQL?

Basically Async SOQL allows you to run SOQL in the background so that, you will get a response after a period of time, but at the same time, helps you to deal with millions or even billions of records without hitting time outs or governor limits.

How can I execute Async SOQL?

Async SOQL is implemented as REST full API and in order to execute this SOQL in the background we need to run a post request:

Captura de pantalla 2017-11-29 a las 21.51.22

And provide a body in JSON format in order to make the call. Find bellow a really easy example that helps you to create Vendor__c custom object records based on Accounts.

Captura de pantalla 2017-12-19 a las 9.40.12What can we highlight?

query: allows you to define a SOQL. On our case, the object I’m going to read, Account and the field I want to recover, Name.

operation: help us to define what we want to do, insert or upsert (please check big objects entry to understand Primary Key and upsert, as it works in the same way)

Basically that is the main difference between simple SOQL and Async SOQL. With this new feature we can read and create or update records in one go.

targetobject: object we want to use to create new records. In our case, Vendor__c.

Then, we need to specify from where we want to get record field values and where we want to store this info. For that we have 2 keys in this JSON code.

targetFieldMapping: allows you to define source field and target fields

and

targetValueMap: allows you to map a target field with a literal value, for instance, we want all records have Spain as Country__c field value.

How is the response?

When we do the post call, we get a similar response:

Captura de pantalla 2017-11-29 a las 22.01.12

It is also in JSON format and basically it provides similar information that you passed in the post call. Just highlight 3 keys:

jobId: remember this is an asynchronous call so there is an Id related to the background job. You can also use it in the post call via the global variable $Job_Id as part of key value on targetValueMap.

status: helps you to know where is the job. Moving from New to Running, Complete, Failed etc.

message: before executing the call, the code is analyzed. If it realizes about a possible issue, it doesn’t start the call and provides an error message in that field. For instance, below message. My VendorName__c field size was too short to store all Account Names.

Captura de pantalla 2017-11-29 a las 22.11.03.png

How can I stop the execution?

Async SOQL doesn’t provide a UI in Salesforce like other background jobs like Batch Apex or Queueable. But we can make an http delete call passing the jobId as part of the url.

Captura de pantalla 2017-11-29 a las 22.22.02

How can I check the progress?

Similar as before, as we do not have UI, we need to look for another way in order to get this information. For that, we have 2 options:

1º – Make a get post call: Similar to the cancel action, if we pass the JobId as part of the url, and do a get call, the response shows you information abut the background execution that is running in the system.

Captura de pantalla 2017-11-29 a las 22.32.33

2º – Make a SOQL against BatckgroundOperation object: This is a new object where we can see extra information about the job execution.

Captura de pantalla 2017-11-29 a las 22.32.42

But this is not the only new object that we can get in the system. We also have BackgroundOperationResult that will help you to identify any issue during the execution if the result is not the expected one. For instance. Bellow image shows an issue related to a field that is required but the source value is empty. In that case, a new error is logged on this table, but the execution doesn’t stop, it continues till the end.

Just keep in mind that this information would be removed after 7 days.

Captura de pantalla 2017-11-29 a las 22.32.55

And what about Archiving?

Yes, you are right. We promised to show you how to archive Code Review records but tried to explain the whole functionality with a really simple use case. Now it’s time to move to Code Review History use case.

As I mentioned before, we would like to create Code Review History records reading Code Review records. Similar as before we will do a post call but the body would be like this one:

Captura de pantalla 2017-11-29 a las 22.51.40

On BigObjects post, the object was called Test1CodeReviewHistory__b instead of CodeReviewHistory__b

And … that’s all. Simple, isn’t it?

Can I integrate this functionality into Apex?

Yes, of course, at the end we are just making http calls, so we only need to set the proper url, add it on my remote settings and create a string in JSON format for the body.

Captura de pantalla 2017-11-29 a las 22.42.57

Summary

Finally I would like to sum up some key concepts we have talked about.

  1. Async SOQL allows you to run SOQL in the background
  2. It takes some times but allows you to process millions or even billions of records
  3. You do not need to worry about governor or time out limitations
  4. It is implemented as REST full API
  5. You can make as many calls as you want per day but just a single one at a time
  6. You can read and create records in one go. Delete is out of scope
  7. This feature is part GA and part still in Pilot:
    1. Read Standard or Custom objects and create Standard, Custom or Big Objects is in Pilot
    2. Read Big Objects and create Standard, Custom or Big Objects is GA

Captura de pantalla 2017-11-29 a las 22.43.44

2 thoughts on “Process billions of records with Async SOQL

  1. Pingback: BigObjects and Triggers – Agustina odeian

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s