Using A Data Warehouse With Qlik Sense – An Ometis Guide
A common topic that creeps up when first engaging with prospects is data warehouses and whether they should use a data warehouse with Qlik Sense.
I’ll be up-front and say you don’t need a data warehouse with Qlik Sense. But that’s not to say you should never use one, as Qlik Sense works perfectly well with them as a data source. Providing the data warehouse with Qlik Sense does not limit a business’s ability to answer all questions quickly and accurately, then I have no argument against them. In fact, there are two main reasons I would suggest using a data warehouse with Qlik Sense:
- Firstly, when the data warehouse already exists and contains all the necessary data, at the lowest-level of granularity and held in a business-ready state. This would enable a quicker ROI (Return On Investment) than starting from scratch. Over time, as the number of data sources increases, or when maintenance costs are due, it would be a worthy exercise to understand the practicality of removing the data warehouse and streamlining the ETL process by allowing Qlik to go direct to source.
- Secondly, when there are multiple business intelligence tools used within an organisation. For larger organisations, the business intelligence tool may be owned by the business, departmentally. Therefore, to allow IT to govern these centrally, a data warehouse makes absolute sense.
On the flip side, a data warehouse absolutely doesn’t make sense when one doesn’t already exist and the business is screaming for questions to be answered yesterday! Or when Qlik is the sole analytical tool and there is no data warehouse already available to tap into, or the data warehouse is poorly designed for true, interactive, data discovery.
As an alternative to a data warehouse, I’d recommend leveraging the benefits of QlikView Documents (QVDs). For more information on QVDs, read my blog post on the ideal Qlik Sense Solution Architecture. For those still interested in reading more, I have summarised below the four areas worth considering when thinking about the need of a data warehouse with Qlik Sense.
Time
If you are considering implementing a data warehouse specifically for Qlik Sense, you may wish to think twice. Time is money, we must never forget that when pricing the cost of a business intelligence project. Yes, organisations price the cost of man-hours to implement a data warehouse but they often forget about the cost of NOT analysing data while the business waits for them to build it. With the average data warehouse implementation taking 1-2 years, that’s a long time to have to wait. Furthermore, by the end of the project the business requirements will most likely have evolved. In comparison, Ometis will on average target a mere 5-10-day sprint to have a collection of requirements delivered in Qlik Sense.
But that’s not all, you must also factor in the time to maintain data structures. Data is fast becoming, if it’s not already, one of a company’s most valuable assets. Data structures will inevitably grow in both width and length, so an organisation will need to be agile enough to make changes quickly with minimal impact to the business. Typically, changes to the structure of a data warehouse takes weeks or months to transition into production, incorporating multiple people of varying roles, we call this the ‘Ask. Wait. Answer. Cycle’. Qlik, on the other hand, is built with agility in mind – needing just one skill set. Providing Qlik Sense has been implemented correctly, businesses of all sizes should be able to benefit from rapid development.
In fact, we work with a large company who implement all changes to their ETL process in Qlik Sense prior to the data warehouse for this very reason – proof that your data warehouse may be slowing you down?! By implementing the changes in Qlik first, it enables their business to move forward in a shorter amount of time. Not to mention, there is no downtime necessary when making changes to a Qlik Sense application or QVD.
This leads me to a less obvious limitation of a data warehouse and Qlik, a disjointed ETL process. This can occur when the time-lag to implement new requirements is considered too great (as mentioned in the example above), or if the data warehouse has a lack of connectivity to certain data sources, such as Excel spreadsheets. By spreading the ETL process across multiple tools you open the business to siloed data transformations, making it difficult to depict the data lineage – what happens to the data and where does it happen. This can cause the reload time to be extended, as the data has to flow through two tools (Data warehouse and Qlik), as opposed to one.
Data
A data warehouse can compromise the flexibility of the data and therefore reduce the potential value of that data. Data warehouses will often aggregate the source data, thus removing the lowest-level of granularity from the transactional data, leaving a consolidated view. For that reason, I often compare a data warehouse to the preview of the latest Game of Thrones series – it has all the best bits but lacks context. You are left asking, ‘What’s going on there and why?’. You’ll never truly be able to answer the ‘Why’ questions with pre-aggregated data.
Additionally, a data warehouse is often designed on preconceived notions which become out-of-date by the time it is finally built. Compared to loading data directly into Qlik, or even using QVDs as the data warehouse/semantic layer, this is often carried out using short and quick iterations. Using a data warehouse with Qlik will also result in unnecessary data duplication. To clarify, the data is stored originally at source, then duplicated in some form in a data warehouse, and potentially again into QVDs (even with a data warehouse QVDs are often used). It’s worth noting QVDs typically have higher compression rates than other data stores, thanks to its use of symbol tables and bit-stuffed pointers. Furthermore, the structure of data in a data warehouses is not usually designed for use with Qlik, and its associative engine. This is counter-intuitive, especially if it’s only intended for Qlik!
Lastly, Qlik’s motto is ‘leave data where it is’ – something I say a lot to prospects if they are considering a data warehouse and Qlik.
Connectivity
One of the key selling points of Qlik Sense is its ability to connect to many disparate sources, it then raises the question of why limit it to one – a data warehouse. Though, despite having a data warehouse, organisations will often still connect Qlik to additional data sources, because, ‘It’s a rogue spreadsheet and it’ll be quicker’, ‘The data warehouse lacks connectivity to that source’ or ‘We’ll have to pay for that connector’. I have spoken to numerous companies who, at the start of their Qlik implementation, insist on everything being in their data warehouse. Fast forward 3, 6, 12 months and Qlik is extracting from 5+ source systems, bypassing the data warehouse for some data sets, and Qlik is responsible for at least 50% of the ETL process.
To date, Qlik can connect to most data sources with ease via 100+ free data connectors. Can the same be said for all data warehouse systems? Possibly, but probably not.
Return On Investment
When purchasing any business intelligence system an organisation will always factor in the return on investment. So, what impacts the ROI of a BI tool? As you may have guessed, it comes down to a string of costs:
- Software licencing costs and ongoing maintenance/subscription.
- Infrastructure costs and ongoing maintenance.
- Skills costs – the cost to train staff and maintain those skills. Knowledge of the software/underlying principles is one point to consider, but never forget the knowledge of how the system has been designed and implemented – they are two very different but equally important pieces of knowledge.
- Periods of analytical downtime that comes with maintenance of infrastructure and software – the cost of not having an operational system during core hours.
Qlik is best served on a standalone, dedicated server. Therefore, if you choose to implement a data warehouse as well, that’s two environments (at least) that must be maintained. Data warehouses will additionally require a separate set of skills to Qlik, the same can be said for licences, periods of downtime, etc. It’s important to note that a Data warehouse has a high cost and time barrier-to-entry, this is the opposite for Qlik; which has a low cost and time barrier-to-entry (Qlik Sense Desktop is currently free and Qlik Sense Enterprise server has a minimum purchase requirement of 1 professional license).
That’s not to say data warehouses don’t offer a ROI but due to long implementation time scales and additional costs of supporting a data warehouse AND a visualisation tool, the ROI is significantly reduced, or at the very least will extend the time it takes to achieve it. If you consider the overlap in capability, due to a QVD layer acting in much the same way, it begs the questions of what does the data warehouse offer that can’t be met by a QVD layer – is a QVD layer good enough? It is interesting to note, Qlik can also store to .txt and .csv formats as well.
Conclusion
We are in the era of data and being able to store data effectively is crucial. More important is the ability to access the entire data catalogue quickly and respond to business questions immediately. A KPI all business intelligence departments should be measured on is how fast they can take a business requirement and have it implemented – TTI (Time To Insights).
The shorter the TTI, the quicker a business can respond. It is therefore important for every business to measure their TTI, and through optimising this KPI, organisations will evidently gain a competitive advantage.
By Chris Lofthouse
Follow @clofthouse89
Comments