April 18, 2024 By Oli Reynolds

Data Extraction using APIs in Qlik Sense

Share:

  • Data Extraction using APIs in Qlik Sense
    5:53

In today's data-driven landscape, businesses are constantly seeking innovative ways to extract actionable insights from a myriad of sources. APIs serve as invaluable tools for accessing external data, and with Qlik Sense's robust REST connector, businesses can seamlessly integrate API data into their analytics workflows.

In this blog, Oli Reynolds, delves into the nuances of extracting data when interfacing with APIs that return multiple tables of information.

Understanding the Qlik Sense REST Connector

The Qlik Sense REST Connector is a versatile tool designed to establish connections to RESTful APIs and retrieve data for analysis. With its flexibility and ease of use, the REST connector enables businesses to tap into the wealth of data available through various APIs and integrate it seamlessly into their analytics processes. Whether it's a proprietary API, a third-party service, or a public data source, the Qlik Sense REST Connector provides the flexibility to connect to a wide range of APIs.

Setting up the connection

The first step in extracting data from any API using the Qlik Sense REST Connector is to configure the connection settings. This involves specifying the API endpoint URL, authentication method, and any required parameters.

Defining the load script

Once the connection is established, users can define a data load script to retrieve and load data from the API into Qlik Sense; this is where we will focus our attention. Many API calls return data that span across multiple tables. If pagination is involved this can become a problem where the generated keys (__KEY*) create incorrect links between tables as the keys restart at one each time the call is made.

Here we have a simple example where a call returns two tables of data for training playthroughs and modules. This API returns 200 rows of data per call and therefore needs to be paginated. The generated key __KEY_playthroughs will be created between 1-200 each time the call is made.

As there is pagination, we will need to loop the call to return all the data from the API. Below is a very simple loop that calls a sub routine to run the script. This loop iterates three times just as a demo, however, using your API documentation you can build a loop that suits the pagination type. You can, for example, run the loop until your variable is null if the pagination returns a ‘next page’ URL, once this comes to the last set of results it will return a blank field. You could also use a timestamp (now()) instead of the loop number to generate a unique key that would avoid having to reference previous reloads.

There are two methods that can be applied to ensure your Keys/Links are correct:

  • Variable and Key generation
  • Joining

Variable & Key generation

If you want to keep the returned tables separated, and there is no direct key from the data itself, you can use the generated _KEY* field. However, you will need to modify this using a variable.

In the loop script above, we are generating a variable ‘vLoopNo’ which increments each time the loop is run,. This is what we will use to generate a new __KEY_Playthroughs field.

As you can see, we concatenate the variable ‘vLoopNo’ a tilde and the original __KEY_playthroughs value to create a new key which will be unique for each run, and when concatenated together will provide the correct link between the playthroughs and modules tables.

Joining

Joining reduces the number of tables in your final data model. This can be useful when there are many more tables returned from the API. Here, we are simply left joining to the playthroughs table on the generated __KEY_playthroughs field, as the join is done for each loop, there is no need to change the key value. The __KEY* values can also be dropped at the end as these are no longer needed. For this method, a new empty table is required to be created for the new joined data to concatenate to after each loop. This needs to be created outside of the loop.

By default, the tables will be generated with the higher level hierarchy tables at the bottom of the script. In our example, playthroughs is loaded after modules. For the join to work, we will need to move the playthroughs to the top of the script. It is worth spending time looking at what is generated from your selected data to make sure the table you want to join to is loaded first.

When to use each method

I prefer the variable method because I’m usually dealing with low cardinality data. The overall script runs faster when I do the join once at the end, leaning on the Qlik engine functionality that indexes the field values to perform that join faster. However, if I have high cardinality data and there isn’t much performance improvement to be had from this method, I tend to use the joining method as it’s easier to maintain the script.

There are times where a hybrid of the two can be used. In the example below we have a large number of tables returned within the Orders call. I have broken the tables into five main tables and, instead of creating a key here, I can utilise the data and use the direct keys that already exist. However, the lower level tables are joined using the __KEY* values. For example, ‘refunds’ has the data from ‘total_duties_set’ and ‘order_adjustments’ joined to it, ‘transactions’ is then its own table with its own set of joined data.

Want to learn more?

Please reach out using the form below if you would like to learn more on this subject or would like assistance in setting up your own APIs.

You can stay up-to-date with all the latest Qlik news, tips and tricks by following Ometis on LinkedIn, Twitter and by subscribing to our YouTube channel.

Comments

About Author

Oli Reynolds

Oli is one of our senior Qlik consultants and has many years’ experience working with the software. He specialises in Qlik Sense app development and enjoys using Qlik’s GeoAnalytic toolkit. He’s also a passionate football fan – ask him about his Fantasy Football Qlik Sense app!

Be the first to hear about new events and updates