imec

imec is the world’s leading independent nanoelectronics R&D hub and home to more than 5,500 people. Because data is core to good research, imec set out to roll out a broad range of data & analytics capabilities company-wide.

imec’s ambitions

Realizing this ambition would put a lot of stress on imec’s traditional SAS-based warehouse. The company was concerned about the scalability and sustainability of this solution and was looking for a modern, cloud-based alternative that would let it scale from a handful of data products to structural, data-driven insights and decisions across all business lines.

imec’s needs to reach this ambition

imec was well aware that replacing tools alone would not be sufficient to reach this ambition. It was open to rethinking its way of working and to defining guiding principles and best practices for building and running data products.

To mature its notebook-based platform and way of working into a robust, state-of-the-art data platform with a well-governed, federated approach to building data products, imec identified the capabilities it was still missing. These included more sophisticated scheduling and orchestration, container-based data processing, proper security and data access control, and logging & monitoring, among others. Keeping costs under control at scale was also a very important driver for imec.

imec was very well aware of its pain points but was looking for guidance in navigating potential solutions. We provided that guidance by making things as practical and tangible as possible through hands-on technical demos. We showcased the benefits of centralized governance, development and unit testing templates in PySpark and dbt, infrastructure automation, Continuous Integration and Continuous Deployment (CI/CD), observability, and more.

How Conveyor fits in imec’s ambitions

As a guiding principle, imec decided to follow data mesh principles: organize use cases per domain (HR, Marketing, Sales, R&D…), treat data as a product owned by that domain, and leverage a self-service platform for the development and federated governance of data products.

As a cornerstone of imec’s self-service data platform, imec chose Conveyor, a product to guide data scientists and engineers through all stages of the data lifecycle, from experimentation to industrialization.

Alongside Conveyor, imec's data platform is built on a few foundational Azure services such as Azure Data Lake Storage (ADLS), Azure Key Vault, and Azure App Service. The data lake is secured by Azure's Role-Based Access Control (RBAC) and integrates smoothly with the workload identity management capabilities of Conveyor. In addition to Conveyor's Spark runtime for streaming and batch processing, Azure Synapse is used for ad-hoc querying of the data lake and for exposing the data to Power BI for reporting and visualization.

imec aims to support data workers at every level of data maturity. People can now interact with data through Dash apps or Power BI reports, notebooks, or even MATLAB and JMP through APIs and SQL endpoints. Setting up this entire data platform was a time-consuming task, but Conveyor, as a major enabler, was operational in just one month.

Conveyor’s impact at imec

Within a year of installing Conveyor, more than 30 use cases across 10 different domains have seen the light of day, going from ideation to production. The number of data products has rapidly grown beyond 70. These data products cover imec’s wafer production process, R&D activities, marketing activities, and more. Costs have been reduced significantly.

Developers can focus mainly on writing business logic, as Conveyor assists in quickly creating and deploying new use cases. These deployments can be triggered via the version control system.

Moving forward, imec will continue to build a self-service data retrieval tool for imec's R&D department. In time, this tool will empower over 1,000 engineers to efficiently retrieve and explore data from their experiments, freeing up their time to focus on imec's core business.

Several new engineers have been onboarded to the platform and are empowered by its self-service capabilities to rapidly experiment with and deploy new projects. The growing number of data products, and the synergies made available by combining insights from different domains, present imec with opportunities for expanding its analytics efforts.

Highlights of imec’s success story

✓ imec introduced a self-service platform for the development and federated governance of data products with Conveyor as a cornerstone.
✓ Conveyor and its necessary Azure resources were operational in just one month.
✓ Within a year of installing Conveyor, more than 30 use cases across 10 different domains have seen the light of day.
✓ Conveyor is the core of a platform that will empower over 1,000 engineers to efficiently retrieve and explore data from their experiments, freeing up their time to focus on imec's core business.

Learn from imec's journey beyond DIY platforms. Watch this expert panel talk.

imec’s success story highlights its transition from a traditional SAS-based warehouse to a scalable, cloud-based solution. With Conveyor operational in just one month, imec delivered over 30 use cases across 10 diverse domains within a year, transforming its data capabilities. But beyond the success lies a cautionary tale: the pitfalls of attempting to build your own data platform.