To join or not to join?
When you are building a data model in Qlik Sense, sooner or later you will face the decision whether to join, concatenate or keep tables separately with an associated key field. All of the choices might be feasible for the data model to function, but what are the benefits and concerns of the options? And when is it recommended to use each option? This blog post looks at the different script options at hand.
Assessing matching fields and additional fields
Someone once said “the best join is not to join”. Even though you need to be careful when joining tables to avoid duplicating data, the answer is not always that simple. Before making a decision whether to join or not, you first need to consider how many matching fields the tables have. You also need to consider whether you are adding additional columns or just additional rows. Let’s take a closer look what options you have depending on what kind of additional data you want to add to your data model.
One matching field
A join will often make the resulting table larger and it may slow down the reload performance and require more memory. So, if the added table only has a single field in common with the existing table, it could be kept separate in the data model, to avoid impacting the performance. In such situations you may consider using the Keep functionality. The Keep prefix is similar to a Join, but instead of joining the tables, it has the effect of reducing one or both of the two tables to the intersection of the matching data. In other words, you are only keeping the data that has any existing associations.
When the sole purpose of adding a table is to add one complementing field, for example a description field, you rather want to use it as a mapping table and add the additional field by using the ApplyMap function. The advantage with the mapping functions is that they are typically executed faster than joins. Also, mapping tables are automatically dropped after the script execution and will not burden the data model.
Two or more matching fields
If the added table has more than one field in common with the existing table, and all those matching fields are needed to make the correct associations, you might want to consider a join. Ask yourself if the additional fields to be added, beyond the matching fields, are needed to widen the existing table? If the answer is yes, then a join might be necessary.
You can still avoid a join in such cases by replacing the matching fields with a single key field and to keep the added table separated this way. Creating a unique primary key based on multiple fields, also known as a composite key, can simply be done by merging all the fields into one key field using the ampersand ( ‘&’ ) character. However, this may result in a long and complex key. To remove the complexity, the key can be replaced by the Autonumber function to create a more compact key. Replacing a long key with a shorter number means less characters used and more memory spared.
When you have all fields in common between the tables and you merely need to add additional rows by extending the existing table to a taller table, the option is normally to concatenate them on top of each other. Concatenation is different to a join in the sense that the purpose is not to change the existing table structure nor to add more fields, but purely to add more data input to the existing structure. A typical case is when you are extending an existing fact table by adding new data values to the existing fields. If the table structures are identical, Qlik will automatically concatenate the tables for you. It is good practice to add flag fields to separate each source table when you are using the concatenate prefix, in order to make it easier to filter the exact data subset needed in the user interface or in SET expressions. A disadvantage with concatenating several tables with different structures into one single table is that it may consume more memory and slow down the reload time compared to keeping the tables separate in a normalised star schema.
Summary of script options
In this blog post we have discussed the different options you have when adding additional tables to your data model. The summary below illustrates the options depending on the number of matching fields and whether the purpose is to widen or to extend the existing table.
Do you want to know more about data modelling and get hands-on experience in combining tables? Sign up to our Qlik Sense Data Prepare Advanced course or get in touch with the Ometis team today.
Topic: Data analytics