MIT and Databricks Report Finds Data Management Key to Scaling AI

SEP 20, 2022

A new report from MIT Technology Review, in association with Databricks, found that 72% of C-level respondents believe data mismanagement will jeopardize future AI success.

The report, “CIO vision 2025: Bridging the gap between BI and AI,” is based on a survey of 600 global CIOs, CDOs, and CTOs from 14 industries conducted in May and June 2022. According to Databricks, the purpose of the report is to understand how leaders are thinking about challenges in data management and business value realization as they work to unleash the power of AI in their enterprises.

Among the key findings, 78% of surveyed executives say scaling AI successfully is a top priority for their data strategies, and more than half expect AI use to be widespread or critical in IT, finance, product development, marketing, sales, and other business functions by 2025. In addition, 94% indicate they have already adopted AI in their organization. A majority of companies say they will invest in unifying their data analytics and AI platforms in the next three years, and 99% of leaders believe this will be crucial for the success of their overall data strategy.

Scaling AI involves improving data management, including data processing speeds, governance, and quality. When asked which aspects of their company’s data strategy need the most improvement, 35% of respondents pinpointed slow data processing speeds, and 25% named a lack of sufficient data to feed AI and ML models. Access to and integration of external data was also a concern for 26%.

These are the tangible benefits of AI listed by respondents both for today and the future. Source: Databricks/MIT Technology Review

“Data issues are more likely than not to be the reason if companies fail to achieve their AI goals, according to more than two-thirds of the technology executives we surveyed,” says Francesca Fanshawe, editorial director for MIT Technology Review and editor of the report. “Improving processing speeds, governance, and quality of data, as well as its sufficiency for models, are the main data imperatives to ensure AI can be scaled.”

Data security is also a priority: leaders say they plan to increase spending on security improvements by an average of 101% over the next three years. Over the same period, they also plan to invest 85% more in data governance, 69% more in new data and AI platforms, and 63% more in existing platforms.

The report lists a few attributes of successful data and AI technology foundations. The first is the democratization of data, which involves a greater number of data-literate employees who can configure and improve AI algorithms. Openness is another attribute, with open standards and data formats allowing organizations to source data, insights, and tools externally to facilitate collaboration. Third, a multi-cloud approach can give access to faster and more powerful data processing but involves data management complexity, and technology foundations should include platforms with centralized capabilities such as MLOps.

The report concludes that for many organizations, the journey to becoming AI-driven has just begun: “CIOs recognize that their organizations have thus far only scratched the surface of the efficiency, speed, innovation, and other gains that the use of AI and machine learning can generate across different functions. They also recognize that the data, talent, and other foundations they are putting in place to support AI development cannot remain static,” the report states. “The foundations must evolve not just to enable the critical scale of use cases to be reached, but also to keep pace with future advances in the science of AI and the demands they may pose for additional power, expertise, and process change.”

These are the impediments to achieving AI goals, cited by survey respondents. Source: Databricks/MIT Technology Review

Databricks says the challenge of becoming AI-driven starts with data architecture that is equipped to handle workloads for business analytics, data engineering, data streaming, and machine learning. The company says a unified platform, such as a data lakehouse, can provide flexible, high-performance analytics, data science, and ML by combining the performance, reliability, and governance of data warehouses with the scalability, low cost, and workload flexibility of the data lake.

“These insights from global CIOs are consistent with what we hear in the field. AI-ready data is no longer a nice-to-have — it is critical to solve real-world problems and drive business outcomes,” says Chris D’Agostino, Global Field CTO at Databricks. “An open and unified platform like the Databricks Lakehouse enables organizations to put their data into action and we are committed to ongoing innovations that will empower business leaders to deploy and scale mission-critical AI projects successfully.”