AUG 04, 2022
More Than Scalability: The Cloud Enables Smaller Businesses to Do Big Things with Data
DEC 02, 2021
Built with powerful AI, public clouds’ analytics and developer services make possible for SMBs what had long been the province of enterprises.
When Chris Patti says “weather is massive,” the chief data officer and CTO of AccuWeather is referring to data — specifically, the amount of data necessary to track and predict the weather all over the world, from big storms to hyperlocal forecasts.
“We’ve been Big Data since before Big Data was a thing,” Patti says.
AccuWeather is one of the best-known weather information services in the world, reaching more than 1.5 billion people globally through mobile devices, websites, radio, television and newspapers. However, the company, based in State College, Pa., employs fewer than 500 people. Only about 40 professionals work on the data science team.
Like millions of companies worldwide, AccuWeather has been gradually increasing its cloud use over the past decade. It moved its first workloads to Microsoft Azure in 2012, mainly to host and run its APIs, which today handle more than 40 billion requests per day. Over time, the company shifted more functions to Azure, including some artificial intelligence services, such as a natural language processing bot to answer consumer questions, like “Is it going to rain today?”
Most recently, AccuWeather’s data science team created a massive data lake of more than 3 petabytes inside of Azure to carry out machine learning in the cloud. The team uses Azure Databricks, a data analytics service offered within Azure, to prepare the data that gets fed into the machine learning models to produce clean, aggregated results.
“Machine learning can process so much so quickly,” says Patti. “It finds all kinds of correlations we didn’t see before. It can explain things like why it’s so windy in a very specific location. We’re seeing a revolution in meteorology.”
AccuWeather’s data scientists can also use the cloud-based data lake and Azure tools for exploratory projects, without added cost.
According to Chida Chidambaram, an expert in AI, machine learning and cloud at Deloitte, these opportunities to experiment can be a significant advantage for smaller companies looking to go to market rapidly.
“The cloud is very accessible,” says Chidambaram. “All you need is a credit card. The big providers have tools for companies to get set up quickly.”
The cloud’s low barrier to entry helps to explain its growing popularity with small and midsized businesses. According to the “2021 State of the Cloud Report” by Flexera, 69 percent of SMB workloads and 67 percent of all data will reside on a public cloud platform within a year.
For small and midsized businesses, just as important as the cloud’s scalability is the access it provides to sophisticated services, powered by AI and machine learning, that allow them to analyze and operationalize data in ways that have traditionally been possible only for enterprises. As Patti puts it, “the business value should outweigh the investment by about five times.”
Global Fishing Watch, a nonprofit that advances ocean governance, couldn’t meet its mission without the cloud. The Washington, D.C.-based organization uses cutting-edge technology to learn more about human activity at sea and its impact on the global seafood industry.
“We had huge data sets, including GPS tracking information for more than 300,000 vessels,” says Paul Woods, co-founder and chief innovation officer of the nonprofit. “We knew we would need to use machine learning to make analyzing that data into a scalable process that we would be able to distribute to the world for free.”
Using Google Cloud Platform’s BigQuery analytics platform, along with Earth Engine, a tool that provides geospatial data from a wide range of sources, Global Fishing Watch allows university researchers to access data and experiment with it in the cloud.
Machine learning is necessary for this analysis because the data is often messy. For example, one data source the organization uses is the Automatic Identification System (AIS), a well-established radio protocol that vessels use to communicate with one another.
“AIS transmissions themselves are not always reliable indicators,” Woods says. “For example, fishing vessels are associated with a specific AIS code, but sometimes a boat will use a different code. What we’ve done is train the machine learning models to recognize the typical pattern of a fishing vessel. We can tell based on the way it moves whether it’s dragging trawl nets or using a long line with hooks. The models can also detect when a vessel is having an encounter at sea, potentially transferring catch to another boat, or if it has disabled its AIS device. Putting all the information together helps us detect and report unregulated, unreported fishing.”
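One of the signals Woods mentions, a vessel going silent on AIS, can be illustrated with a simple rule. The sketch below is not Global Fishing Watch’s method, which relies on trained machine learning models; it is a minimal, rule-based stand-in that flags long silences between consecutive transmissions. The function name `find_ais_gaps`, the six-hour threshold and the sample timestamps are all hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def find_ais_gaps(timestamps, max_silence=timedelta(hours=6)):
    """Return (last_seen, next_seen) pairs where a vessel was silent
    for longer than max_silence between consecutive transmissions.

    The six-hour default is an illustrative assumption, not a value
    taken from the article or from any AIS standard.
    """
    ordered = sorted(timestamps)
    gaps = []
    # Compare each transmission with the one that follows it.
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > max_silence:
            gaps.append((previous, current))
    return gaps


# Hypothetical transmission times for a single vessel.
pings = [
    datetime(2021, 1, 1, 0, 0),
    datetime(2021, 1, 1, 1, 0),
    datetime(2021, 1, 1, 12, 0),  # 11 hours after the previous ping
    datetime(2021, 1, 1, 13, 0),
]

gaps = find_ais_gaps(pings)
for last_seen, next_seen in gaps:
    print(f"Silent from {last_seen} to {next_seen}")
```

A fixed threshold like this would flag innocent gaps, such as poor reception far from shore, which is one reason a model trained on movement patterns is a better fit for the job Woods describes.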
The cloud has allowed the small but technically savvy team to develop a highly effective offering for governments and fishing concerns alike. Recent research using ML and Global Fishing Watch’s data has been able to detect anomalies, which may be helpful in assessing the risk of issues such as forced labor on fishing vessels and false GPS transmissions.
For Woods, building and running data analysis programs in the cloud is his organization’s clearest path to success.
“The cloud gave us full enterprise scale from day one,” he says. “With the architecture that you get, we don’t have a DevOps team. Everyone on the team is focused on building stuff, not running stuff.”
The cloud helped Snapdocs solve its messy data problem too.
The behind-the-scenes provider for real estate closings compiles, digitizes and sorts documents from lenders, borrowers and title companies. Machine learning models built with Amazon Web Services’ SageMaker split hundreds of pages into subsets to get the correct documents to the right people so they can sign them in the right places.
“The cloud allows us to go to market in a more scalable way,” says Briana Ings, head of product at Snapdocs. Without AI, she says, “we would have to wait for every company in the industry to digitize itself. With AI, we don’t need integration. We can automate the process ourselves.”
A midsized business, Snapdocs has been able to maintain a lean operation by relying on cloud-based analytics.
“Without machine learning, we would have to employ hundreds more people to manually process these documents,” says Greg Romrell, head of data at Snapdocs. “When we first started using machine learning, we needed two to three people and about 30 minutes of processing time to finish a closing package. Now we need at most one person, and the finalization process takes less than 20 minutes.”
Natural Language Processing