Author: Sidhanth Mohapatro
Sometimes we face a challenge where we have to fetch a huge number of records from a database. This can be done with a simple select query if you have a limited number of records, but what if the record count runs into tens or hundreds of thousands? It can still be done with a simple select query, but your Mule application might run out of memory. It is also a poor design: it is likely to harm application performance and can be unreliable.
Therefore, to solve the above problem we use pagination. Instead of retrieving all the records at once, we use LIMIT and OFFSET values to retrieve them in fixed-size batches, and keep processing until all the records have been fetched from the database.
Note: This tutorial also explains how to send dynamic queries to the database. This is needed when the query parameters change at runtime.
Note: The LIMIT keyword is supported by databases such as MySQL, PostgreSQL, and Snowflake. If you are using SQL Server or Oracle, use the "OFFSET … ROWS FETCH NEXT … ROWS ONLY" syntax in place of LIMIT.
Step 1: Call the database and retrieve the total number of records.
SELECT COUNT(*) FROM employee_db;
Note: You can use a Transform Message component and build the query as a string, as above, to query the database dynamically.
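As a sketch of such a dynamic query, a Transform Message script can build the SQL text with DataWeave string interpolation (this assumes the limit and offset flow variables set in Step 2; the table name emp_db is from the example below):

```dataweave
%dw 2.0
output text/plain
---
// Interpolates the current page size and position into the query text.
// Safe here because vars.limit and vars.offset are numbers set by the
// flow itself, not user input.
"SELECT * FROM emp_db LIMIT $(vars.limit) OFFSET $(vars.offset)"
```

The same approach works for the count query in Step 1 if, for example, the table name needs to come from a variable.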
Step 2: Using Set Variable components, initialize the offset and limit values, i.e., offset = 0 and limit = 500.
Note: Limit is the maximum number of records you want to fetch from the database at a time; the value depends on the requirement. Offset is the position from which records are fetched, i.e., if offset = 100, records are fetched starting from the 100th row.
Step 3: Based on the total record count, we generate an index array. The size of the array depends on the total count and the limit.
Ex: if the total record count = 2000 and limit = 500,
then 2000 / 500 = 4. Hence we build an array of 4 elements, which will iterate 4 times to fetch records from the database using the limit.
i.e., arr = [1, 2, 3, 4] -> this is passed to a For Each component, which loops 4 times and fetches 500 records on every iteration.
DataWeave logic below:
%dw 2.0
output application/java
var rowCount = (payload."COUNT(*)" reduce ($$ ++ $)) as Number
var maxSize = ceil(rowCount / vars.limit)
fun prepareList(list: Array, maxSize: Number) =
    if (sizeOf(list) >= maxSize) list
    else prepareList(list ++ [(sizeOf(list) + 1) as Number], maxSize)
---
prepareList([], maxSize)
Step 4: Query used to fetch records:
SELECT * FROM emp_db LIMIT :limit OFFSET :offset
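If you use the Database connector's input parameters rather than building the whole query string, the :limit and :offset placeholders in the query above can be supplied from a small DataWeave script. A minimal sketch, assuming the flow variables from Step 2:

```dataweave
%dw 2.0
output application/java
---
// Maps flow variables onto the :limit and :offset
// placeholders of the Select operation's input parameters
{
    "limit": vars.limit,
    "offset": vars.offset
}
```

Using input parameters keeps the SQL text static, which avoids string concatenation in the query itself.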
Step 5: Inside the For Each, we update the offset variable after calling the database. Once a batch has been fetched, we set a new offset value, so the next iteration fetches the next 500 records starting from the 500th record.
Set Variable logic: vars.offset + vars.limit
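The offset update above fits in a one-line Set Variable expression; as a sketch:

```dataweave
%dw 2.0
output application/java
---
// New offset = old offset + page size.
// With limit = 500 this advances 0 -> 500 -> 1000 -> 1500
// across the four iterations of the example.
vars.offset + vars.limit
```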
Pagination is one of the best approaches whenever you want to process a large data set in chunks without harming application performance. The above method can be applied to any connector where a large data set is processed, unless you want to go for batch processing instead.