MLOps as the key to unlocking the potential of AI


SOURCE: CALCALISTECH.COM
FEB 17, 2022

"While the promise of AI has made its way into our lives, its steep barriers and increasing requirements have made only the most technologically advanced organizations able to harness its true potential," write Shay Grinfeld and Itay Inbar of Greenfield Partners

Over the last decade, Artificial Intelligence has become an increasingly prevalent force in our everyday lives.

Its uses range from consumer applications, such as recommendations on Netflix and Spotify, to workplace staples such as AI-based fraud detection, process automation, and cybersecurity. In the near future, AI is set to spread into still more aspects of our lives. Its continued adoption and its integration with new applications such as autonomous driving and healthcare have prompted IDC to project that the global AI market will reach $550 billion by 2024.

This rapid growth is fueled by developments in deep learning, computer vision, and natural language processing, which continue to advance through the combined work of academia and Big Tech research groups such as those at Google, Facebook, AWS, and OpenAI. Thanks to the age of open source, many of these advancements are available for public use.

Though promising, these developments in AI are not without limitations.

The Deployment Gap

While these collaborative open-source projects form the heart of the AI revolution, bringing AI into production requires a complex, multi-step pipeline in which each step has its own challenges. From collecting and preparing data, through experimentation and research and then training and evaluation, to deployment and monitoring, each phase requires significant resources and expertise.

As noted in a recent survey: “Many companies haven’t figured out how to achieve their ML/AI goals, bridging the gap between ML model building and practical deployments is still a challenging task. There’s a fundamental difference between building a model in a notebook and deploying an ML model into a production system that generates business value.”

As such, an estimated 90% of ML models fail to make it to production.

Enter MLOps

Just as DevOps has significantly streamlined the path from software development to production, a new category of applications has risen to improve the effectiveness of machine learning: MLOps, which by definition is the set of practices at the intersection of Machine Learning, DevOps, and Data Engineering. MLOps enables companies to innovate and bring products to market faster and with greater efficiency. Though the precise definition of what is included in MLOps (vis-à-vis the traditional data stack or DevOps) can be open to interpretation, the current landscape encompasses hundreds of unique startups and prominent open-source projects seeking to tackle these challenges.

MLOps Landscape - Credit: Greenfield Partners

The Israeli MLOps Landscape

As with nearly every facet of technological advancement, a wealth of innovative Israeli MLOps-focused startups is driving the area, many of which have raised, in aggregate, hundreds of millions of dollars across the different segments in the space:

Data Preparation – We’ve all heard the adage “data is the new oil,” which is very accurate in the context of AI. High-quality data acts as the fuel for AI models; without it, the result is a case of “garbage in, garbage out.” Companies such as Monte Carlo and Databand provide reliability for data pipelines, ensuring quality data is consistently fed to the models, while open-source projects such as Treeverse’s LakeFS enable organizations to version their datasets so that they are shareable and reproducible across development teams. To increase model accuracy, Explorium, Datagen, and Datomize supplement an organization’s existing data with external and synthetic data.

Model Development and Training – While most ML models are based on open-source projects at their core, companies must fine-tune them to their specific needs and production environments to drive optimal results. Experimentation platforms like Comet provide data scientists with solutions to document, collaborate on, and analyze model outputs, while organizations such as Deci optimize models to run with greater accuracy and shorter runtime on a developer’s specific hardware.

Deployment Platforms – As in comparable technology segments, MLOps faces a best-of-suite vs. best-of-breed choice. Projects led by major cloud providers such as Google’s Kubeflow, Databricks’ MLflow, and AWS’ SageMaker are the leading one-stop-shop solutions, but fall short of offering complete feature sets. Innovating in this space, startups like Iguazio and Qwak offer holistic platforms that enable companies to build, deploy, and monitor their ML models.

Monitoring – This segment has drawn significant focus from Israeli startups. Live production models require continuous monitoring and testing to identify drifts in precision and output. Several companies, such as Aporia, Deepchecks, and Superwise, ensure the integrity and efficiency of live models by continuously monitoring for changes in the underlying data or for infrastructure downtime.

AutoML – Much as Tableau and Power BI made data analysis and visualization accessible to a wider audience, AutoML seeks to expand the capabilities of machine learning beyond practicing data scientists. While broad enterprise AutoML platforms such as DataRobot and Dataiku have grown in recent years, companies like Pecan, BeyondMinds, Noogata, and others are developing AutoML integrations into companies’ existing analytic workflows, providing powerful use-case-specific and sector-specific predictive capabilities.

Infrastructure – Model complexity and scale are rapidly increasing, necessitating faster, cheaper, and more efficient infrastructure. Many frameworks to date are built on combinations of GPUs and traditional storage, media ill-equipped for the task. The Israeli MLOps ecosystem has made significant leaps in this arena, with startups such as Habana and Hailo crafting new AI-dedicated chips for data centers, while organizations like Run:AI virtualize existing clusters of GPUs. VAST Data, a portfolio company of Greenfield Partners, and Weka materially increase storage speeds, optimizing data centers to handle the steep requirements of modern AI applications.
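
To make the Monitoring segment above more concrete, here is a minimal sketch of one common way to flag a change in the underlying data: the Population Stability Index (PSI), which compares how a single numeric feature was distributed at training time with how it is distributed in live traffic. This is an illustration only, not the method of any company named in this article. The function name, the bin count, and the 0.25 alert threshold (a commonly quoted rule of thumb) are all assumptions chosen for the example.

```python
import math
from typing import List, Sequence

# Assumed alert level: a commonly quoted rule of thumb, not a standard.
PSI_ALERT_THRESHOLD = 0.25


def population_stability_index(
    reference: Sequence[float],
    live: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a reference sample and a live sample of one numeric feature.

    Hypothetical helper for illustration. Bins are equal-width over the
    range of the reference sample.
    """
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread to bin")
    width = (hi - lo) / bins

    def bin_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp out-of-range values into the first or last bin.
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref_shares = bin_shares(reference)
    live_shares = bin_shares(live)
    return sum(
        (l - r) * math.log(l / r) for r, l in zip(ref_shares, live_shares)
    )


# Toy data: the live feature has shifted toward the upper half of the range.
training_values = [i / 100 for i in range(100)]
shifted_values = [0.5 + i / 200 for i in range(100)]

psi = population_stability_index(training_values, shifted_values)
if psi > PSI_ALERT_THRESHOLD:
    print("Possible data drift: investigate before trusting model output")
```

A score of zero means the two samples fall into the bins in identical proportions, and larger scores mean a larger shift. In practice a check like this would run per feature on a schedule, and the threshold would be tuned to the model and the data rather than taken from a rule of thumb.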

While the promise of AI has made its way into our lives, its steep barriers and increasing requirements have made only the most technologically advanced organizations able to harness its true potential. The arrival of MLOps, however, addresses these complexities, making AI accessible to an ever-growing number of organizations that seek to leverage it with less complexity and less in-house expertise.

From left: Shay Grinfeld & Itay Inbar - Greenfield Partners (Credit: Greenfield Partners)

The article was written by Shay Grinfeld, Managing Partner, and Itay Inbar, Senior Associate, at Greenfield Partners.