Over the past decade, businesses have witnessed a substantial increase in both the number and variety of data tools. This surge primarily results from organizations transitioning from on-premise to cloud solutions. The concept of the “modern data stack” (MDS) has expanded, promising systems that are more flexible and reliable. Companies now have access to tools that support data warehousing, ETL processes, real-time analytics, and machine learning, providing a comprehensive suite of solutions to handle diverse data needs. For instance, in 2021 alone, the global data warehouse market size reached $21 billion, indicating the scale and importance of these tools in modern business operations.

Big data (pre-2013)

In the big data period, businesses operated under the premise that more data equated to more value. They invested heavily in technologies to process and analyze vast amounts of data, seeking insights that could transform their operations. Despite these efforts, they often found that storing and managing large volumes of data was far simpler than extracting actionable insights. For instance, a typical enterprise in the early 2010s might handle petabytes of data but struggle to find meaningful patterns without advanced analytics tools.

Modern data stack (post-2013)

Following the big data period, companies recognized the need to update their data infrastructures, leading to a proliferation of new data tools. Vendors flooded the market, each claiming their solution was key to unlocking business insights. The modern data stack era saw the rise of platforms that could integrate multiple data sources and types, offering advanced analytics and visualization tools. For example, the adoption of platforms like Snowflake and Databricks illustrates how businesses have embraced these integrated tools to improve decision-making processes.

Challenges with data tool overabundance

With the surge in data tooling came new challenges, particularly regarding system complexity. Businesses faced difficulties integrating disparate tools, leading to a fragmented data ecosystem. The complexity of managing multiple tools increased operational costs and extended the time to insights. For example, surveys have found that nearly 47% of businesses report that data silos and integration issues significantly hamper their analytics capabilities.

Major corporations have since made significant investments in data infrastructure, often without a clear strategy for extracting value. This approach led to disproportionately high costs with minimal return on investment. It was common to find different teams within the same organization using overlapping tools for similar purposes, tripling costs without tripling benefits. Large enterprises can waste up to 35% of their technology budgets on redundant tools and underutilized solutions.

Despite existing challenges in the modern data stack, interest in AI has expanded, leading to a new wave of data tooling. This transition began even before the full market consolidation of MDS, illustrating the industry’s rapid shift toward AI capabilities. Enterprises are increasingly adopting AI-driven tools to refine their data strategies and improve operational efficiency. 

Differences in the AI stack

AI tools differ fundamentally from previous data technologies because they primarily handle massive volumes of unstructured data. These tools use generative models that are non-deterministic, meaning their outputs can vary for the same input, which challenges traditional testing and evaluation paradigms. New frameworks to test, evaluate, and monitor AI systems are needed because a single pass/fail check on one output says little about a model whose responses vary from run to run. For instance, generative AI models like GPT-3 have demonstrated remarkable versatility but also unpredictability, necessitating comprehensive governance and ethical monitoring frameworks to manage their use in business applications.
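One way to picture what evaluating a non-deterministic system might involve is to sample the same prompt many times and report rates rather than a single pass or fail. The sketch below is illustrative only: `make_toy_model` is a hypothetical stand-in for a real generative model, and the helper names, the prompt, and the sample size are assumptions, not part of any particular evaluation framework.

```python
import random
from collections import Counter
from typing import Callable


def sample_outputs(generate: Callable[[str], str], prompt: str, n: int) -> list[str]:
    """Call the model n times on the same prompt and collect every output."""
    return [generate(prompt) for _ in range(n)]


def pass_rate(outputs: list[str], check: Callable[[str], bool]) -> float:
    """Fraction of sampled outputs that satisfy a property check."""
    return sum(1 for o in outputs if check(o)) / len(outputs)


def agreement(outputs: list[str]) -> float:
    """Share of samples equal to the most common output (1.0 means fully consistent)."""
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return most_common_count / len(outputs)


def make_toy_model(seed: int) -> Callable[[str], str]:
    """Hypothetical stand-in for a generative model: same prompt, varying answers."""
    rng = random.Random(seed)
    answers = ["Paris", "Paris", "Paris", "paris, France", "Lyon"]
    return lambda prompt: rng.choice(answers)


model = make_toy_model(seed=0)
outputs = sample_outputs(model, "What is the capital of France?", n=50)

# Evaluate a property of the answer instead of an exact string match.
rate = pass_rate(outputs, lambda o: "paris" in o.lower())
consistency = agreement(outputs)

print(f"pass rate: {rate:.2f}, agreement: {consistency:.2f}")
```

Checking a property of the output (here, whether the answer mentions Paris) instead of an exact string is a deliberate choice: exact-match assertions tend to be brittle when wording varies between runs.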

Areas for further exploration

The expanding AI stack opens numerous avenues for innovation and optimization, especially in complex business environments. One notable area is agent orchestration, which involves multiple AI models communicating and cooperating to complete tasks. In industries such as logistics and supply chain management, agent orchestration can optimize routes and schedules in real-time, responding to changing conditions without human intervention.

Purpose-built models for specific industries are another growth area. These models are tailored to unique sector needs, offering more precise predictions and insights than general models. For example, in healthcare, purpose-built AI can predict patient risks and outcomes based on personalized data, potentially improving care and reducing costs. Financial services use similar models to assess credit risk or detect fraudulent transactions with greater accuracy.

Workflow tools for fine-tuning models on private data to create customized models are also emerging as a key opportunity. These tools let companies make the best use of their proprietary data to refine AI models without exposing sensitive information. Such capabilities are essential for maintaining privacy and compliance, especially under regulations like GDPR.

Building smarter in AI

Learning from the past

Businesses today must remember the lessons from previous data tooling excesses as they transition to the AI-driven era. In the past, many organizations rushed to adopt new technologies without a clear understanding of their potential benefits, leading to wasted resources and underwhelming results. During the big data boom, companies accumulated vast amounts of data but often lacked the capability or strategy to extract meaningful insights, highlighting the gap between data collection and value creation.

Learning from these experiences, businesses must approach AI deliberately, focusing on applications that directly contribute to their strategic goals. They need to invest in AI not just as a technological upgrade but as part of a comprehensive business strategy that includes training, governance, and ethical considerations.

Strategic focus on value generation

Enterprises today are learning that they need a clear understanding of the specific value that data and AI tools bring to their operations. Leaders are finding that focusing on tools that demonstrate a clear return on investment (ROI) leads to better resource allocation and more impactful outcomes.

For example, an AI tool that reduces customer churn by accurately predicting at-risk customers can provide a clear path to increased revenue and improved customer satisfaction. Similarly, AI-driven supply chain tools that reduce delivery times and costs can directly increase profitability. Businesses are thus encouraged to quantify the expected benefits of AI initiatives in terms of operational efficiency, cost savings, and revenue generation before making significant investments.
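As a rough illustration of what quantifying expected benefits could look like for the churn case, the sketch below estimates retained revenue and a simple ROI figure. Every number in it (customer count, churn rates, revenue per customer, tool cost) is a made-up assumption for the sake of the arithmetic, and the function names are hypothetical.

```python
def churn_benefit(
    customers: int,
    baseline_churn: float,
    churn_with_tool: float,
    annual_revenue_per_customer: float,
) -> float:
    """Annual revenue retained if the tool lowers the churn rate as assumed."""
    retained_customers = customers * (baseline_churn - churn_with_tool)
    return retained_customers * annual_revenue_per_customer


def simple_roi(expected_annual_benefit: float, annual_cost: float) -> float:
    """ROI expressed as net benefit divided by cost."""
    if annual_cost <= 0:
        raise ValueError("annual_cost must be positive")
    return (expected_annual_benefit - annual_cost) / annual_cost


# Illustrative assumptions only: 10,000 customers, churn falling from 20% to 17%,
# 1,200 in annual revenue per customer, and 150,000 in annual tool cost.
benefit = churn_benefit(
    customers=10_000,
    baseline_churn=0.20,
    churn_with_tool=0.17,
    annual_revenue_per_customer=1_200.0,
)
roi = simple_roi(benefit, annual_cost=150_000.0)

print(f"expected annual benefit: {benefit:,.0f}")
print(f"simple ROI: {roi:.0%}")
```

Under these assumptions the tool would retain about 300 customers a year. The value of the exercise lies less in the final figure than in forcing the churn reduction and cost assumptions to be stated explicitly before the investment is made.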

Advice for founders and investors

Founders today face the challenge of differentiating their offerings in a crowded market. They need to assess whether they have a unique perspective or capability that addresses a gap in the current market before launching new tools. The proliferation of “me too” tools — products that offer no distinct advantages over existing solutions — can dilute brand value and lead to market confusion.

Founders should ask themselves if they are the right team to tackle the problem they’ve identified and whether their solution offers a noticeable improvement over existing tools. If a founder’s assessment does not yield a confident “yes,” they should reconsider their strategy. Developing a tool just because it seems profitable or because venture capital is available can lead to products that fail to meet market needs.

Investors must work through a complex environment where not all investments in the data and AI tooling stack will yield returns. They should move beyond superficial metrics like founder pedigree and instead analyze where value accrues in the tooling stack.

They should look for companies that address underserved needs or that can integrate with existing systems to provide clear value. Before investing, they should evaluate the competition, the technical soundness of the solution, and the team’s ability to execute its vision. An investment in a company that offers a nuanced and necessary improvement in data handling or AI application is more likely to succeed than one in a company whose value proposition is unclear or redundant.

Final points

The evolution of data and AI tooling over the past decade has influenced how businesses approach, process, and derive value from data. With the expansion of the cloud and the proliferation of data tools, companies have moved from the era of big data to an age in which the modern data stack and AI-driven solutions dominate.

Despite the challenges of integration, complexity, and cost inefficiency, opportunities abound in the new AI stack, particularly in agent orchestration, industry-specific models, and workflow tools for fine-tuning models on private data. As organizations continue to build smarter AI, they must learn from past tooling excesses and focus on strategic value generation.

Founders and investors alike need to critically assess the market and avoid redundant tools, while businesses must establish frameworks to quantify the value of their data initiatives. By doing so, they can ensure that their investment in data and AI supports and drives their strategic business objectives.

Alexander Procter

May 27, 2024
