“Data Is Not Just Numbers – It’s Fuel for Ideas”: Digital Transformation Expert Denis Prilepskiy on How to Implement Data Strategies for Business Success

Interviewee: Denis Prilepskiy is a Digital Transformation Executive with extensive experience in leading digital and IT strategy, enterprise architecture, and IT organizational optimization. He has worked with top firms like McKinsey, Accenture, and PwC, helping organizations drive innovation and navigate the digital era. Denis holds multiple certifications, including TOGAF 9.2, PRINCE2, PSM II, PSPO I, FinOps, and ITIL 4, and is recognized for his ability to implement forward-thinking solutions that foster growth and competitive advantage.

With your extensive experience working in organizations of various formats and scales – from large banks to small startups – how has your understanding of the role of data in business evolved along the way?

Early in my career, working on a project at a large bank, I observed that data was mostly seen as a supporting resource for reporting and regulatory compliance. The architecture was fragmented: each department collected its own data, and the company lacked a comprehensive analytics roadmap. Whenever a new report or metric was needed, teams had to spend a long time searching for the necessary sources and integrating them manually, often through partial, ad-hoc workarounds. It was then that I realized how labor-intensive even finding basic information can be in a large organization.

In contrast, the situation at a startup was quite different: although there was less data, it was used much more flexibly and rapidly. A small team could gather data from various APIs within a day or two and test a new hypothesis. Data was perceived not just as numbers but as “fuel for ideas” – something used to quickly generate prototypes of services and models. It was in this agile environment that I truly felt the value of short iterations: as soon as we added a new ML module, the results were immediately tested on real users. This clearly showed me how data can accelerate innovation when handled quickly and effortlessly.

Over time, my understanding of the role of data in business evolved: I came to see data as a strategic asset for a company. In the modern corporate context, data forms the foundation of digital products, helps personalize services, manage risks, and optimize processes. This is especially evident in fintech projects: high-quality data about customers, transactions, and operations enables fast training of models for credit scoring, fraud detection, or support automation. Now, I view data as the “core” of the enterprise architecture, around which analytical and AI platforms are built, not just a random set of reports.

What do you understand today about the concept of an “AI data strategy”? Why is it not enough to just have “a lot of data”?

By AI data strategy, I mean a comprehensive approach where data is collected, stored, and processed not merely for accumulation, but in the context of specific AI/ML business goals. This means that business objectives determine which data is needed, how it should be prepared, and which technologies and teams are responsible for its maintenance. For example, if the goal is to automate loan issuance, the strategy will specify which customer parameters are important, what transaction context is required, and how to ensure data quality.
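To make the idea that “business objectives determine the data” concrete, here is a minimal Python sketch of what such a specification might look like for the loan-issuance example. The field names, thresholds, and rules are hypothetical and exist only to illustrate the pattern: the required data and its quality criteria are declared from the business goal, and incoming data is checked against them.

```python
import pandas as pd

# Hypothetical specification for an automated loan-issuance model:
# the business goal dictates which customer attributes are required
# and what quality rules they must satisfy. All names and limits here
# are invented for illustration.
LOAN_DATA_SPEC = {
    "monthly_income":    {"dtype": "float64", "min": 0},
    "debt_to_income":    {"dtype": "float64", "min": 0, "max": 1.5},
    "months_of_history": {"dtype": "int64",   "min": 0},
}

def validate_loan_data(df: pd.DataFrame) -> list[str]:
    """Return a list of quality issues; an empty list means the data is usable."""
    issues = []
    for column, rules in LOAN_DATA_SPEC.items():
        if column not in df.columns:
            issues.append(f"missing required column: {column}")
            continue
        if str(df[column].dtype) != rules["dtype"]:
            issues.append(f"{column}: expected {rules['dtype']}, got {df[column].dtype}")
        if df[column].isna().any():
            issues.append(f"{column}: contains nulls")
        if "min" in rules and (df[column] < rules["min"]).any():
            issues.append(f"{column}: values below {rules['min']}")
        if "max" in rules and (df[column] > rules["max"]).any():
            issues.append(f"{column}: values above {rules['max']}")
    return issues
```

The value is not in these particular checks but in the direction of derivation: the model’s purpose defines the spec, and the spec drives collection and quality monitoring, rather than the other way around.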

Such a strategy includes the architecture for data storage and processing, quality management processes, metadata management, and alignment with business metrics. In other words, it is a clear plan for transforming raw data into valuable artifacts: features for models, analytical metrics, and reports. Without such a strategy, what often results is a “data swamp”: a multitude of tables, files, and unfiltered streams that are poorly structured and provide little business value. I have seen companies that thought simply accumulating gigabytes of logs was enough, but without understanding how to use them, this only led to growing storage costs and chaos in integrations.

Simply having “a lot of data” is not a solution. If data lacks a clear usage scheme, it can turn out to be irrelevant or redundant. Many believe that by accumulating a billion transaction records, they will solve any problem; but without well-thought-out management processes, quality monitoring, and understanding exactly how AI models need this data, volume leads more to technical debt than competitive advantage. That is why an AI data strategy requires not only technology (storage, processing, ML platforms) but also organizational efforts: specialists, processes, and a culture that teaches teams to see the value in data and work with it systematically. Such a strategy ensures that data is not just collected, but transformed into a reliable foundation for intelligent decision-making.

What role does a corporate feature store play, and when does it stop being just a nice-to-have and become a real necessity?

A corporate feature store is a platform or service that centralizes and manages features used in ML models. Its primary role is to ensure the reuse and consistency of these features across different teams and environments (both training and online inference). Essentially, it is a repository of “ready-to-use” features: once a feature is developed, it can be connected to any number of models without re-implementation.

At the early stages of machine learning adoption, many companies manage without a feature store: models are trained directly on raw data or features generated ad-hoc in code. But as the number of products and models grows, duplication of work arises: different teams create similar features independently, a unified understanding of their formulas and quality is lost, and moving from prototype to production becomes increasingly difficult. It is at this maturity stage – when an organization has dozens of models and several Data Science teams – that a feature store ceases to be a “toy” and becomes a necessity. It accelerates the time to deploy new models by ensuring the same feature calculations are used everywhere.

In one large bank where I was involved in AI projects for scoring and marketing, the feature store indeed became a key component. Instead of recreating features each time (for example, average monthly account turnover or debt-to-income ratio), we moved them into a shared service. This reduced model deployment time and guaranteed that inference used the same formulas and data as training. Moreover, it simplified feature monitoring and updating: changing the calculation logic in one place immediately reflected across all models. Thus, the corporate feature store turned out to be not just a technological experiment but a working tool on the path to mature ML infrastructure.
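As a rough illustration of that “define once, reuse everywhere” principle, here is a simplified, self-contained sketch. It is not the platform we used at the bank or any particular feature-store product; the column names and formulas are assumptions made for the example.

```python
import pandas as pd

# A toy feature registry: each feature's calculation logic is defined once,
# so training pipelines and online inference share the same formula.
FEATURE_REGISTRY = {}

def register_feature(name):
    def wrapper(fn):
        FEATURE_REGISTRY[name] = fn
        return fn
    return wrapper

@register_feature("avg_monthly_turnover")
def avg_monthly_turnover(transactions: pd.DataFrame) -> pd.Series:
    # Average monthly account turnover per customer; assumes columns
    # customer_id, date (datetime), and amount.
    monthly = transactions.groupby(
        ["customer_id", transactions["date"].dt.to_period("M")]
    )["amount"].sum()
    return monthly.groupby("customer_id").mean()

@register_feature("debt_to_income")
def debt_to_income(customers: pd.DataFrame) -> pd.Series:
    # Debt-to-income ratio; changing this formula here updates every model.
    return (customers["monthly_debt"] / customers["monthly_income"]).rename("dti")

def compute(name, *sources):
    """Both offline training and online scoring call this same entry point."""
    return FEATURE_REGISTRY[name](*sources)
```

Real feature stores add offline/online storage, point-in-time correctness, and monitoring on top of this, but the core contract is the same: one registered definition serves both training and inference.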

What are the most common mistakes you see companies make when trying to centralize features for ML models?

The main mistake when creating a centralized feature store is thinking that technology alone will solve the problem. Often, companies rush into implementing a feature store without establishing clear business cases and rules for its use. As a result, the shared pool ends up with hundreds of low-value features, and ML specialists simply continue generating their own “features in code,” bypassing the common repository. In other words, without understanding why and how teams will use the features, the centralized service risks becoming an abandoned catalog.

Another typical issue is overly rigid centralization at the expense of flexibility. Developers sometimes fail to consider that different models may require similar but slightly different features, and that the online serving environment can differ from the training environment. If the feature store imposes a pattern that is inconvenient or slow, teams find workarounds. I have often seen ML engineers duplicate features directly in pipelines for optimal performance or add their own extensions to avoid being constrained by the platform.

Finally, the organizational factor is often underestimated. Without a shared culture and documentation around a unified feature store, teams find it difficult to trust and use it. Some companies forget that employees need training on working with the new service and that processes for updating and validating features must be formalized. Without this, the feature store sooner or later turns into a “dead” catalog rather than a useful tool. The success of a centralized feature store depends not only on software implementation but also on thoughtful collaboration between the teams that create and consume it.

Is Data Mesh just a trend, or is it a practical approach? Have you worked on implementing it? What are the most common challenges you’ve seen?

Data Mesh has been actively discussed as a concept for several years now, but different organizations approach it in different ways. On one hand, it embodies real ideas: shifting responsibility for data to the same teams that create the business product (domain teams), rather than keeping everything within a centralized IT department. On the other hand, many perceive Data Mesh simply as a buzzword and confuse it with decentralization without considering the details. In my experience, there have been projects where a “transition to Data Mesh” was declared, but in practice, this often ended in confusion: the unified data team disappeared, but clear interaction processes did not emerge.

I have encountered implementations of Data Mesh elements in large companies that tried to distribute responsibility for analytical “data products” across domains (for example, payments, loans, marketing, etc.). The most common problem is underestimating the importance of a shared culture and standards. If each team begins building its own pipelines and data formats according to its preferences without a unified schema, the benefits of decentralization quickly diminish: data still ends up fragmented, integration between domains becomes very difficult, and the reuse of information drops. Because of this, many Data Mesh pilot projects rely heavily on a centralized platform: it must set some common rules and help synchronize approaches.

Thus, Data Mesh becomes a practical approach only if there are clear rules and support from a centralized team. Without a common data catalog, data contracts, and governance, the idea creates more chaos than benefit. Often, companies start with small experiments, learn from mistakes, and gradually build a platform for cross-domain data exchange. But today, I would call Data Mesh more of an evolution of the “data as a product” paradigm rather than a magic solution to all data problems. It is a powerful toolset, but without a data ownership culture and thoughtful architecture, it won’t work.
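For readers unfamiliar with data contracts, here is a hedged sketch of what a minimal one might look like; the product name, fields, and SLA are hypothetical, and in practice such an agreement usually lives in a schema registry or catalog rather than hand-written code.

```python
from dataclasses import dataclass

# A hypothetical, minimal data contract: the producing domain commits to
# a schema, an accountable owner, and a freshness guarantee that consuming
# domains can rely on.
@dataclass
class DataContract:
    product: str          # e.g. "payments.settled_transactions"
    owner: str            # the accountable domain team
    schema: dict          # column name -> expected type
    freshness_hours: int  # maximum acceptable data lag

    def check_schema(self, actual: dict) -> list:
        """Compare an actual dataset schema against the contracted one."""
        problems = []
        for name, dtype in self.schema.items():
            if name not in actual:
                problems.append(f"missing column: {name}")
            elif actual[name] != dtype:
                problems.append(f"{name}: expected {dtype}, got {actual[name]}")
        return problems

# Example: the payments domain publishes its contract; consumers validate
# every delivery against it before building downstream products.
payments_contract = DataContract(
    product="payments.settled_transactions",
    owner="payments-domain-team",
    schema={"txn_id": "string", "amount": "float64", "settled_at": "timestamp"},
    freshness_hours=4,
)
```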

How do companies need to change organizationally for Data Mesh to work? Is it about culture, processes, or architecture?

From my experience, Data Mesh starts with organizational changes: it is primarily about data ownership culture and new processes. Technology is certainly important, but it comes second. First, you need to define which teams will become the “owners” of data domains, appoint responsible individuals (data product owners), and train them to work with data as a product. Without clear accountability for data quality and delivery, the approach simply won’t work.

The culture of data ownership must permeate all levels, from analysts and engineers to product managers. Processes should include collaborative development of data contracts, unified security standards, and mechanisms for shared usage. Only after the company builds a common understanding that metadata, dictionaries, and reference data are described uniformly does it make sense to roll out a Data Mesh architecture. It is the architecture (a unified platform with API access, data catalogs, and monitoring systems) that helps bring these processes to life and enables teams to self-serve.
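As an illustration of what “described uniformly” can mean in practice, here is a hypothetical catalog entry for a data product. Every field name here is an assumption, but the point is that each domain fills in the same template, which is what makes discovery, trust, and unified security standards possible.

```python
# A hypothetical catalog entry: every domain describes its data product
# with the same uniform metadata, so consumers can discover and trust it.
CATALOG_ENTRY = {
    "product": "loans.credit_applications",
    "domain": "loans",
    "data_product_owner": "loans-analytics-team",
    "description": "All submitted credit applications with decision outcomes.",
    "glossary_terms": ["credit application", "scoring decision"],
    "sensitivity": "PII",  # drives the unified security standard applied
    "access": "self-service via platform API, PII columns masked by default",
    "update_frequency": "hourly",
}
```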

So, although Data Mesh is often described primarily as an architectural pattern, in practice, it is more about organization and culture. In companies where engineers and analysts are already used to working closely with the business and placing data at the center, transitioning to a “data as a product” model is much easier. The key is leadership support and structural changes: you need to create incentives for collaboration and unified principles. The technical platform is just a tool that enforces these processes. If you remain only at the IT solution level without changing culture and processes, Data Mesh is doomed to inefficiency.

What currently inspires you in working with AI and data? Do you have a “professional challenge” that you haven’t solved yet but would like to?

I’m inspired by the current explosion of AI possibilities: the emergence of large models, generative systems, and AutoML tools is changing what can be created with data. Even as someone who has designed complex corporate architectures for many years, I’m fascinated by the speed of change: what yesterday seemed like an experiment is turning into a product today. It’s especially interesting to see how AI integrates into real business applications: for example, automated decision-making in lending or chatbots in customer support. It feels like the boundaries between research and production are disappearing faster than ever.

In working with data, I’m also inspired by the pursuit of reliability and scalability of solutions. Building platforms that allow rapid experimentation without sacrificing quality and compliance is a very engaging task. I love seeing how properly configured pipelines transform “raw” transactional records into actionable insights while maintaining security and transparency. This is especially important in fintech, where laws and data protection requirements constantly evolve. Being able to adapt the system to new regulations without halting business processes – that’s where real engineering interest lies for me.

As for professional challenges, I’m currently especially drawn to questions of more complete automation and self-sufficiency of ML platforms. I want to find a balance between centralized expertise and self-service for product teams, so they can independently fetch the data they need and deploy models without constantly relying on a central data office. Additionally, the challenge of ensuring ethical AI use and data privacy remains unresolved. This is especially crucial in banking, where speeding up development must go hand in hand with strict regulatory compliance. I would call this complex set of issues my current professional challenge, which I’m happy to continue working on.

Summary:

Denis Prilepskiy discusses the importance of a well-defined AI data strategy, highlighting the role of data in driving business innovation. He emphasizes the need for organizational culture and processes to support data initiatives like feature stores and Data Mesh. Prilepskiy also touches on challenges related to automation, self-sufficiency, and ethical AI use in regulated industries.
