Bulk Query

Introduction

In this blog, I am going to show how to use the Salesforce Bulk API query feature to fetch a large amount of data from Salesforce. Use bulk query to efficiently query large data sets and reduce the number of API requests. A bulk query can retrieve up to 15 GB of data, divided into fifteen 1-GB files. The supported data formats are CSV, XML, and JSON.

When a bulk query is processed, Salesforce attempts to execute the query. If the query doesn’t execute within the standard 2-minute timeout limit, the job fails and a QUERY_TIMEOUT error is returned. In this case, rewrite a simpler query and resubmit the batch. If the query succeeds, Salesforce attempts to retrieve the results. If the results exceed the 1-GB file size limit or take longer than 10 minutes to retrieve, the completed results are cached and another attempt is made. After 15 attempts, the job fails and the error message “Retried more than fifteen times” is returned. In this case, consider using the PK Chunking header to split the query results into smaller chunks. If the attempts succeed, the results are returned and stored for seven days. The following flowchart depicts how bulk queries are processed.

 

Walk Through a Bulk Query Sample

Step 1: Log In Using the SOAP API

The Bulk API doesn’t provide a login operation, so you must use the SOAP API to log in. Create a text file called login.txt containing the following text, replacing the username and password with your own.
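A minimal login.txt, based on the standard SOAP API partner login envelope, might look like this (the username and password values are placeholders to replace with your own credentials, with your security token appended to the password if required):

```xml
<?xml version="1.0" encoding="utf-8"?>
<env:Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Body>
    <n1:login xmlns:n1="urn:partner.soap.sforce.com">
      <n1:username>your_username@example.com</n1:username>
      <n1:password>your_password_and_security_token</n1:password>
    </n1:login>
  </env:Body>
</env:Envelope>
```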

Run the following command from the cURL command line.
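The login call is a plain HTTPS POST to the SOAP endpoint. A sketch of the command, where the API version (41.0) is an example value to replace with one your org supports:

```shell
curl https://login.salesforce.com/services/Soap/u/41.0 \
  -H "Content-Type: text/xml; charset=UTF-8" \
  -H "SOAPAction: login" \
  -d @login.txt
```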

This returns an XML response that includes the <sessionId> and <serverUrl> elements. We are going to use the sessionId to authenticate the subsequent requests.

Step 2: Create a Job

Create a file called create-job.xml containing the following text.
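A sketch of a create-job.xml for a bulk query job; the object (Account) is an example value to adapt, and the contentType must match the format you want the results in:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
  <operation>query</operation>
  <object>Account</object>
  <concurrencyMode>Parallel</concurrencyMode>
  <contentType>CSV</contentType>
</jobInfo>
```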

Using a command-line window, execute the following cURL command to create a job. Replace the session ID with the one you received in step 1.
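A sketch of the create-job call; instance is a placeholder for your org’s instance (taken from the serverUrl returned in step 1), sessionId for your session ID, and 41.0 for your API version:

```shell
curl https://instance.salesforce.com/services/async/41.0/job \
  -H "X-SFDC-Session: sessionId" \
  -H "Content-Type: application/xml; charset=UTF-8" \
  -d @create-job.xml
```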

The response returns the job information, including the job ID used in the subsequent steps.

Step 3: Add a Batch to the Job

Create a file called query.txt containing the SOQL query statement. The bulk query fetches data based on this query.
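For example, query.txt could contain a simple SOQL statement; the object and fields here are placeholders to replace with your own:

```sql
SELECT Id, Name FROM Account
```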

Using a command-line window, execute the following cURL command to add a batch to the job. Replace jobId with the one you got in step 2.
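A sketch of the add-batch call; the Content-Type must match the contentType of the job, and jobId, sessionId, instance, and 41.0 are placeholders:

```shell
curl https://instance.salesforce.com/services/async/41.0/job/jobId/batch \
  -H "X-SFDC-Session: sessionId" \
  -H "Content-Type: text/csv; charset=UTF-8" \
  --data-binary @query.txt
```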

You will get the batch ID and batch information in the response, as shown below.

Step 4: Check the Status of the Job and Batch

Using a command-line window, execute the following cURL command to check the job status.
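Checking the job status is a GET request against the job resource; the placeholders are the same as in the earlier steps:

```shell
curl https://instance.salesforce.com/services/async/41.0/job/jobId \
  -H "X-SFDC-Session: sessionId"
```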

You will get a response like the one below, which includes the job status.

Using a command-line window, execute the following cURL command to check the batch status.
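The batch status lives under the job resource; a sketch with jobId and batchId as placeholders for the IDs from steps 2 and 3:

```shell
curl https://instance.salesforce.com/services/async/41.0/job/jobId/batch/batchId \
  -H "X-SFDC-Session: sessionId"
```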

This returns the batch status.

Step 5: Retrieve the Results

Using the command-line window, execute the following cURL command to retrieve the batch result list.
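A sketch of the result-list call; it returns the IDs of the result files produced for the batch (placeholders as before):

```shell
curl https://instance.salesforce.com/services/async/41.0/job/jobId/batch/batchId/result \
  -H "X-SFDC-Session: sessionId"
```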

Salesforce returns an XML response with data such as the following.

Using the command-line window, execute the following cURL command to retrieve the results of the query and save them to a result.csv file.
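Each result ID from the previous response identifies one result file; a sketch that fetches one and writes it to disk, where resultId is a placeholder for an ID from the result list:

```shell
curl https://instance.salesforce.com/services/async/41.0/job/jobId/batch/batchId/result/resultId \
  -H "X-SFDC-Session: sessionId" \
  -o result.csv
```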

Step 6: Close the Job

Create a file called close-job.xml containing the following text.
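Based on the standard Bulk API job-info format, close-job.xml only needs to set the job state:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
  <state>Closed</state>
</jobInfo>
```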

Using a command-line window, execute the following cURL command to close the job.
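A sketch of the close-job call, a POST against the job resource (placeholders as before):

```shell
curl https://instance.salesforce.com/services/async/41.0/job/jobId \
  -H "X-SFDC-Session: sessionId" \
  -H "Content-Type: application/xml; charset=UTF-8" \
  -d @close-job.xml
```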

Salesforce returns an XML response with data such as the following.

 

 

Salesforce Bulk API

In this post, I am going to show how to use the Salesforce Bulk API, which provides a programmatic option to quickly load your org’s data into Salesforce.
With the Bulk API, you process a set of records by creating a job that contains one or more batches. The job specifies which object is being processed and what type of operation is being used. A batch is a set of records sent to the server to process, and each batch is processed independently by the server, not necessarily in the order it is received. Batches may be processed in parallel. Here are the detailed steps to load data with the Bulk API.

Step 1: Log In Using the SOAP API

The Bulk API doesn’t provide a login operation, so you must use the SOAP API to log in. Create a text file called login.txt containing the following text, replacing the username and password with your own.
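A minimal login.txt, based on the standard SOAP API partner login envelope; the username and password are placeholders (append your security token to the password if your org requires it):

```xml
<?xml version="1.0" encoding="utf-8"?>
<env:Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Body>
    <n1:login xmlns:n1="urn:partner.soap.sforce.com">
      <n1:username>your_username@example.com</n1:username>
      <n1:password>your_password_and_security_token</n1:password>
    </n1:login>
  </env:Body>
</env:Envelope>
```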

Run the following command from the cURL command line.
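A sketch of the login call; the API version (41.0) is an example value:

```shell
curl https://login.salesforce.com/services/Soap/u/41.0 \
  -H "Content-Type: text/xml; charset=UTF-8" \
  -H "SOAPAction: login" \
  -d @login.txt
```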

This returns an XML response that includes the <sessionId> and <serverUrl> elements. We are going to use the sessionId to authenticate the subsequent requests.

Step 2: Create a Job

Before you can load the data, you need to create a job. The job specifies the type of object, such as Contact, that you’re loading and the operation that you’re performing, such as query, queryAll, insert, update, upsert, or delete. Create a text file called job.txt containing the following text.
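A sketch of job.txt for a CSV insert job; the object (Contact) and operation are example values to adapt:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
  <operation>insert</operation>
  <object>Contact</object>
  <contentType>CSV</contentType>
</jobInfo>
```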

Now run the following command from the cURL command line.
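A sketch of the create-job call; instance is a placeholder for your org’s instance (from the serverUrl in step 1), sessionId for your session ID, and 41.0 for your API version:

```shell
curl https://instance.salesforce.com/services/async/41.0/job \
  -H "X-SFDC-Session: sessionId" \
  -H "Content-Type: application/xml; charset=UTF-8" \
  -d @job.txt
```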

You will get the response as shown below. It includes the jobId, which we will use in subsequent calls.

 

Step 3: Add a Batch to the Job

After creating the job, you need to create the batch that contains the data to be loaded into Salesforce. Create a CSV file that contains the data. Using a command-line window, execute the following cURL command.
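For example, a CSV file matching the insert Contact job might look like this (the file name data.csv and the field values are made up for illustration):

```
FirstName,LastName,Email
John,Doe,john.doe@example.com
Jane,Smith,jane.smith@example.com
```

A sketch of the add-batch call; note --data-binary rather than -d, so cURL preserves the newlines in the CSV:

```shell
curl https://instance.salesforce.com/services/async/41.0/job/jobId/batch \
  -H "X-SFDC-Session: sessionId" \
  -H "Content-Type: text/csv; charset=UTF-8" \
  --data-binary @data.csv
```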

This returns the batchId, as shown below.

Step 4: Close the Job

When you’re finished submitting batches to Salesforce, close the job. This informs Salesforce that no more batches will be submitted for the job, which, in turn, allows the monitoring page in Salesforce to return more meaningful statistics on the progress of the job. Create a text file called close_job.txt containing the following text.
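Based on the standard Bulk API job-info format, close_job.txt only needs to set the job state:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
  <state>Closed</state>
</jobInfo>
```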

Run the following command from cURL.
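A sketch of the close-job call, a POST against the job resource (placeholders as in the earlier steps):

```shell
curl https://instance.salesforce.com/services/async/41.0/job/jobId \
  -H "X-SFDC-Session: sessionId" \
  -H "Content-Type: application/xml; charset=UTF-8" \
  -d @close_job.txt
```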

This returns the job details and updates the job state from Open to Closed.

Step 5: Check Batch Status

You can check the status of an individual batch by running the following cURL command.
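A sketch of the batch-status call, where jobId and batchId are placeholders for the IDs from steps 2 and 3:

```shell
curl https://instance.salesforce.com/services/async/41.0/job/jobId/batch/batchId \
  -H "X-SFDC-Session: sessionId"
```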

This returns an XML response with the batch state, such as Completed or Failed.

Step 6: Retrieve Batch Results

Once a batch is Completed, you need to retrieve the batch result to see the status of individual records. Retrieve the results for an individual batch by running the following cURL command:
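A sketch of the batch-result call; for a CSV insert job, the response contains one row per record with the record Id, a Success flag, a Created flag, and any Error message:

```shell
curl https://instance.salesforce.com/services/async/41.0/job/jobId/batch/batchId/result \
  -H "X-SFDC-Session: sessionId"
```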

Salesforce returns a response with data such as the following

 

 

 

Salesforce Bulk API 2.0

Introduction

In this blog, I am going to show how to use Salesforce Bulk API 2.0. Bulk API 2.0 is generally available in Winter ’18. It brings the power of bulk transactions from Bulk API v1 into a simplified, easier-to-use REST API, and lets you create, update, or delete millions of records asynchronously.
Here are the features of the Salesforce Bulk API 2.0

  • Bulk API 2.0 supports all OAuth flows supported by Salesforce’s other REST APIs.
  • Bulk API 2.0 simplifies uploading large amounts of data by breaking the data into batches automatically. All you have to do is upload a CSV file with your record data and check back when the results are ready.
  • Instead of limiting the amount of data uploaded daily by the number of jobs, Bulk API 2.0 uses a limit of total records uploaded. The limit is 100 million records per 24-hour period.
  • Bulk API v2 does away with the need to manually break up data into batches.

Data Processing 

Salesforce processes the Bulk API 2.0 jobs you create and uses job states to describe the status of a queued job. Salesforce queues a new job for processing once you’ve created it, uploaded data for the job, and moved the job to the UploadComplete state. Once the job is de-queued, Salesforce starts processing the job operation and moves the job to the InProgress state. Once the job has been fully processed, Salesforce moves the job to the JobComplete or Failed state, depending on whether the job was successfully processed.

Salesforce automatically chunks the data for your job into multiple internal batches to improve performance while processing the job. It creates a separate batch for every 10,000 records in your job data, up to a daily maximum of 100 million records. If the limit is exceeded while processing your job data, the remaining data isn’t processed.

For each internal batch, if Salesforce can’t process all the records in the batch in 10 minutes or less, the batch fails. Salesforce retries failed batches up to a maximum of 10 times. If the batch still can’t be processed after 10 retries, the entire job is moved to the Failed state and the remaining job data is not processed. Use the Failed Record Results and Unprocessed Records resources to determine which records were not processed and what errors occurred.

Bulk API v2.0 Example 

Bulk API 2.0 data processing consists of the following steps:

  1. Authenticate with the REST API. We are using Workbench here.
  2. Create a job.
  3. Upload data for the job.
  4. Close the job by setting the job state to UploadComplete. Salesforce starts processing the job at this point.
  5. Check the status of the job.
  6. Retrieve the results of the completed job by checking the successful results, failed results, and unprocessed records.

1. Log in to Workbench
Go to https://workbench.developerforce.com/ and log in to Workbench with your Salesforce credentials.
2. Create a job
To do any Bulk API 2.0 task, such as inserting or updating records, you first create a job. The job specifies the type of object, such as Account, that you’re loading. The job also specifies the operation you’re performing, such as insert or delete. In Workbench, go to Utilities > REST Explorer and create a job by issuing a POST request with the following details.
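In Workbench’s REST Explorer, the path is relative to your instance. A sketch of the create-job request; the API version (v41.0) and the job properties are example values to adapt:

```
POST /services/data/v41.0/jobs/ingest

{
  "operation" : "insert",
  "object" : "Account",
  "contentType" : "CSV",
  "lineEnding" : "LF"
}
```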

Once you execute the REST call, you should get a response that includes the job ID, with a job state of Open. You’ll use the job ID in the Bulk API 2.0 calls in the next steps. You’ll also use the URL in the contentUrl field of the response in the next step, when you upload your data.

3. Upload data for the job

After creating a job, you’re ready to upload your data. You provide record data using the CSV file you created earlier. In Workbench’s REST Explorer, use Bulk API 2.0 to create a job data request with the CSV data. Issue a PUT request with the following details. For the request body, copy and paste the CSV data into Workbench’s Request Body text field. With Bulk API 2.0, you can submit CSV data that does not exceed 150 MB in total size (after base64 encoding).
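A sketch of the upload request, using the path from the contentUrl field of the create-job response; jobID and the CSV rows are placeholders, and the Content-Type header must be text/csv:

```
PUT /services/data/v41.0/jobs/ingest/jobID/batches
Content-Type: text/csv

Name,Industry
Sample Account One,Technology
Sample Account Two,Banking
```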

4. Close the job.

Once you’re done submitting data, you can inform Salesforce that the job is ready for processing by closing the job. In workbench’s REST Explorer, issue a PATCH request with the following details.
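A sketch of the close request; setting the state to UploadComplete tells Salesforce to queue the job for processing (jobID is a placeholder):

```
PATCH /services/data/v41.0/jobs/ingest/jobID
Content-Type: application/json

{ "state" : "UploadComplete" }
```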

5. Check the job status and results.

To get basic status information on a job, such as the overall job state or the number of records processed, use a GET request with the following details:
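A sketch of the status request, a simple GET against the job resource (jobID is a placeholder):

```
GET /services/data/v41.0/jobs/ingest/jobID
```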

6. Retrieve the results

Once a job has been completed and is in the JobComplete state (or Failed state), you can get details of which job data records were successfully processed by issuing a GET request with the following details:
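A sketch of the successful-results request (jobID is a placeholder):

```
GET /services/data/v41.0/jobs/ingest/jobID/successfulResults/
```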

You get a response that contains CSV data, with each row containing a record ID and information on whether that record was successfully processed or not. To get details on records that encountered an error during processing, use a GET request using the failedResults resource instead of the successfulResults resource.