How to use Qlik Talend Cloud Cross Pipeline Projects

Written by Ryan Peachey | Mar 24, 2025 11:19:05 AM

What Cross Pipeline Projects are and why they matter

Qlik Talend Cloud (QTC) recently introduced an important feature: Cross Pipeline Projects. This addition significantly improves how data engineers build and share pipeline architectures, making data initiatives more efficient and collaborative.

This practical guide will show you:

How to set up Cross Pipeline Projects in QTC
A step-by-step implementation with GitHub and Snowflake
Ways to enable your team to use shared data in their own pipelines

The limitation of traditional QTC pipelines

Before Cross Pipeline Projects, QTC users faced a significant constraint: data processed in one project couldn't be easily used in another without re-storing and re-registering that data. This created unnecessary duplication and made collaboration between data engineering teams difficult.

For organisations with multiple data sources and teams, this limitation forced inefficient workflows and slowed development.

How to set up Cross Pipeline Projects in QTC

Let's walk through a straightforward implementation that connects GitHub data to Snowflake using QTC's Cross Pipeline Projects feature.

Prerequisites

For this example, I've already:

Set up connections in QTC
Created two Snowflake databases: OMETIS_LANDING and OMETIS_PIPELINES

Data landing options in QTC

There are two ways to land data using QTC:

Via a replicate project
Via a pipeline project

For ingestion, you would typically use a replicate project, but its output can't be used in other QTC projects without re-storing and re-registering the data. This example therefore uses a pipeline project.

Project 1: Creating your data landing pipeline

The initial project focuses on extracting and landing data:

Connection setup: Configure the 'OmetisLtd Github' block as your GitHub connection
Data landing task: Create the 'Land Github data' task to select and extract data from GitHub
Data storage: Land this data in Snowflake within the OMETIS_LANDING database in a 'GITHUB' schema
Naming convention: Prefix landing tables with 'LANDING_' for clarity (e.g., LANDING_TEAMS, LANDING_USERS)
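
To make the naming convention concrete, here is a minimal Python sketch of how the landing table names in this example are formed. The helper names (landing_table_name, qualified_name) are my own for illustration and are not part of QTC; in practice the names are set in the landing task's settings.

```python
# Illustrative helpers only: QTC applies the prefix through task settings,
# not through code. Names below are hypothetical.
LANDING_PREFIX = "LANDING_"


def landing_table_name(entity: str) -> str:
    """Return the prefixed landing table name, e.g. 'teams' -> 'LANDING_TEAMS'."""
    return f"{LANDING_PREFIX}{entity.strip().upper()}"


def qualified_name(database: str, schema: str, table: str) -> str:
    """Build a fully qualified Snowflake name: DATABASE.SCHEMA.TABLE."""
    return f"{database}.{schema}.{table}"


# The two entities landed in this example.
landing_tables = [
    qualified_name("OMETIS_LANDING", "GITHUB", landing_table_name(entity))
    for entity in ("teams", "users")
]
```

Keeping the prefix in one place makes it easy to tell raw landing tables apart from the unprefixed storage objects created by the onboarding task.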

The crucial onboarding task

The 'Onboarding Task' is where Cross Pipeline Projects truly shows its value:

Create QTC storage assets on your landed data tables
Store these in the OMETIS_LANDING database's GITHUB schema, without the 'LANDING_' prefix (e.g., TEAMS, USERS)
This step makes your data available to other projects without re-registration

The onboarding process automatically creates several Snowflake objects:

Base tables (TEAMS and USERS)
Change tracking tables (__current and __ct suffixes)
Three views per table for different querying needs

------------------------------------------------------------------------------------

Project 2: Using shared data in transformation pipelines

Here's where the practical benefit becomes clear. In a separate project:

Reference your landing storage pipeline (appears as a 'grey' block labelled "Github Ingestion (Via Pipeline)" with "Onboarding_Storage")
Use the referenced data as input for transformation tasks
Land transformed output in a dedicated database (OMETIS_PIPELINES)

The transform output within the OMETIS_PIPELINES database in the GITHUB schema produces:

Two tables (because I chose to materialise the output in the QTC settings)
Three views (the main output view, a changes view, and a view that includes deleted records)

As shown in the screenshot, this creates:

GITHUB_TRANSFORM_1_ct and GITHUB_TRANSFORM_1_current tables
GITHUB_TRANSFORM_1, GITHUB_TRANSFORM_1_changes, and GITHUB_TRANSFORM_1_whdr views
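
For anyone consuming this output downstream, the sketch below shows one way to pick the right view and build a preview query. The function names and the "kind" labels are hypothetical, and the suffixes are taken from the object names listed above. I'm assuming the objects resolve as unquoted identifiers in Snowflake; if your environment creates them with quoted, case-sensitive names, the identifiers would need quoting.

```python
# Hypothetical helpers for illustration; suffixes follow the names listed above.
VIEW_SUFFIXES = {
    "current": "",          # main output view
    "changes": "_changes",  # change history view
    "with_deleted": "_whdr",  # view that includes deleted records
}


def transform_view(base: str, kind: str = "current",
                   database: str = "OMETIS_PIPELINES",
                   schema: str = "GITHUB") -> str:
    """Return the fully qualified name of one of the three transform views."""
    if kind not in VIEW_SUFFIXES:
        raise ValueError(f"Unknown view kind: {kind!r}")
    return f"{database}.{schema}.{base}{VIEW_SUFFIXES[kind]}"


def preview_query(view: str, limit: int = 10) -> str:
    """Build a simple preview query against a view."""
    return f"SELECT * FROM {view} LIMIT {int(limit)}"


query = preview_query(transform_view("GITHUB_TRANSFORM_1"))
```

The resulting string can be run in a Snowflake worksheet or passed to whichever client your team already uses.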

Key advantages of this approach

This Cross Pipeline Projects architecture provides several benefits:

Better orchestration: Schedule transformation tasks to start immediately after storage tasks complete
Less duplication: No need to re-register or re-store data between projects
Visible data lineage: Maintain clarity of data flow across different pipelines
Better collaboration: Multiple teams can build on the same foundational data assets
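
The orchestration benefit is easiest to see as a dependency graph. The sketch below is a conceptual model only: it uses Python's standard graphlib module rather than any QTC API, and the transform task label is a placeholder I've borrowed from the output name in this example. In QTC itself, this ordering is configured through task scheduling.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must complete before it starts.
# Task labels follow this example; the transform label is a placeholder.
dependencies = {
    "Land Github data": set(),                  # Project 1: landing
    "Onboarding Task": {"Land Github data"},    # Project 1: storage
    "GITHUB_TRANSFORM_1": {"Onboarding Task"},  # Project 2: transform
}

# static_order() yields tasks so that every task follows its prerequisites.
run_order = list(TopologicalSorter(dependencies).static_order())

for step, task in enumerate(run_order, start=1):
    print(f"{step}. {task}")
```

The point is that the transform in the second project depends directly on the storage task in the first, with no re-registration step in between.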

Implementation results: what you'll see in Snowflake

After implementation, your Snowflake instance will contain (as shown in the screenshot):

In the OMETIS_LANDING database with GITHUB schema:

Landing tables: LANDING_TEAMS and LANDING_USERS
Change tracking tables: TEAMS_ct, TEAMS_current, USERS_ct, USERS_current
Views for querying: TEAMS, TEAMS_CHANGES, TEAMS_whdr, USERS, USERS_CHANGES, USERS_whdr

In the OMETIS_PIPELINES database with GITHUB schema:

Materialised transform tables: GITHUB_TRANSFORM_1_ct, GITHUB_TRANSFORM_1_current
Transform views: GITHUB_TRANSFORM_1, GITHUB_TRANSFORM_1_changes, GITHUB_TRANSFORM_1_whdr

This structured approach creates a clean data flow from ingestion through transformation, with no redundant data registration steps.
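
If you want to check your own environment against this list, a small script can compare the names you find in Snowflake with what you expect. This is a minimal sketch under stated assumptions: the function names are hypothetical, the suffixes come from the objects listed above, and the "found" list would in practice come from something like the schema's object listing. Names are compared case-insensitively because the listing above mixes upper and lower case.

```python
# Suffixes taken from the object names listed above ("" is the main view).
OBJECT_SUFFIXES = ("", "_ct", "_current", "_changes", "_whdr")


def expected_objects(base: str) -> set[str]:
    """Return the upper-cased object names expected for one base name."""
    return {f"{base}{suffix}".upper() for suffix in OBJECT_SUFFIXES}


def missing_objects(bases: list[str], found: list[str]) -> list[str]:
    """Return expected objects that do not appear in 'found' (case-insensitive)."""
    found_upper = {name.upper() for name in found}
    expected: set[str] = set()
    for base in bases:
        expected |= expected_objects(base)
    return sorted(expected - found_upper)


# Storage objects listed above for OMETIS_LANDING.GITHUB.
found_in_landing = [
    "TEAMS_ct", "TEAMS_current", "USERS_ct", "USERS_current",
    "TEAMS", "TEAMS_CHANGES", "TEAMS_whdr",
    "USERS", "USERS_CHANGES", "USERS_whdr",
]

missing = missing_objects(["TEAMS", "USERS"], found_in_landing)
```

An empty result means every expected table and view is present; the same check works for the transform output by passing GITHUB_TRANSFORM_1 as the base name.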

Scaling this approach for larger needs

While our example uses GitHub data, this pattern works well for various data sources:

Create dedicated landing projects for each major data source
Apply source-specific transformation rules, null handling, and naming conventions
Make these available to downstream transformation projects
Maintain clean lineage throughout your data pipeline

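Because each source gets its own landing project and downstream projects only reference it, the pattern extends to more data sources and more complex transformations without adding re-registration steps or obscuring lineage.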

Result: a more collaborative QTC environment

Qlik Talend Cloud's Cross Pipeline Projects feature represents an important improvement in how data engineers can collaborate. By eliminating the need to re-register data between projects, QTC now supports more efficient data architectures.

For organisations with multiple data teams, this feature enables better collaboration, clearer data lineage, and more manageable data pipelines.

Need help implementing Cross Pipeline Projects in your Qlik environment?

Book a call with our Qlik integration specialists to discuss how we can help you leverage this and other QTC features for your specific business needs.
