Data Science: From Black Box to Glass Box

JUN 12, 2022

Digital solutions have transformed the way utilities operate, empowering them to better understand their assets and, in turn, make more informed decisions. While data and analytical techniques have become common tools in asset management, making sense of large pools of data and using this information to make the best decisions can be a complex process.

That’s where data science comes in. As a unique way of pulling traditional statistics together in a self-recognizing pattern, data science can be an extremely powerful tool in a utility’s solution set. However, it can also be extremely dangerous if you don’t have the right kind of data. Data from connected digital solutions generates all kinds of patterns that reveal all kinds of interesting things, but if that data is not right, it’s really not telling you anything.

As data scientists, our job is to find the value in data – looking beyond the surface-level figures to find the information needed. We then help our customers use this information to their advantage. To successfully unearth this value, it’s important to first employ the most effective method of data science. These methods have evolved over time, from failure records to condition assessment.

Failure records, a primary data source used in asset management, focus on using past failures to predict future failure rates. Condition assessment data goes one step further, providing an estimate of the current state of an asset, as opposed to the state of an asset at the point of failure. Using condition data, we can estimate how quickly an asset has degraded and predict how it will continue to degrade, which helps identify the right intervention before failures occur.

When it comes to managing assets, the goal should always be to predict failures, rather than wait for them. By taking an analytical approach using condition assessment data, we can understand the rate at which an asset reached its current state, and from there we can better estimate when failure is likely to occur. So not only can we identify whether intervention will be needed, we can also estimate approximately when it will be needed.
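To make the reasoning concrete, here is a minimal sketch of that estimate, assuming degradation is roughly linear over time. The function name, the idea of measuring condition as something like remaining wall thickness, and the failure threshold are all hypothetical choices for illustration; real degradation models are usually more involved.

```python
def estimate_failure_year(install_year, inspection_year,
                          initial_condition, current_condition,
                          failure_threshold):
    """Extrapolate a linear degradation rate to the year a threshold is reached.

    All names and the linear assumption are illustrative, not a described
    production method. Condition can be any measure that falls over time,
    for example remaining pipe wall thickness in millimetres.
    """
    elapsed_years = inspection_year - install_year
    if elapsed_years <= 0:
        raise ValueError("inspection_year must be later than install_year")

    # Average loss of condition per year since installation.
    rate = (initial_condition - current_condition) / elapsed_years
    if rate <= 0:
        # No measurable degradation, so no failure year can be extrapolated.
        return None

    # Years left before the condition falls to the failure threshold.
    remaining_years = (current_condition - failure_threshold) / rate
    return inspection_year + max(remaining_years, 0.0)


# Hypothetical pipe: installed in 2000 with 12.0 mm of wall, measured at
# 7.0 mm in 2020, and assumed to fail once it thins to 4.0 mm.
print(estimate_failure_year(2000, 2020, 12.0, 7.0, 4.0))
```

The result answers both questions in the paragraph above: whether intervention is needed (the threshold will be reached) and approximately when.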

Think of it this way: interpreting data is like studying an elephant’s footprints. The footprints are like data, not revealing much beyond the fact that what you’re looking at is big and heavy. They are simply the remnants of what was. Condition assessment allows us to dig much deeper and obtain better context and insight regarding the shape and size of the elephant, ultimately providing more value.

Data science is one of many tools in a utility’s toolbox – but there are many other things that need to be considered when it comes to maximizing its potential. It takes a multi-disciplinary team with a diverse range of skill sets to interpret data. Our team combines civil engineering with data tools like machine learning to solve problems more effectively. We help clients pinpoint their problems and their desired outcomes. We then work with them to identify the data already available and plug any data gaps, assessing how we can transform that data into insight.

Armed with this valuable context and insight, we have the power to apply these learnings to deliver transformative outcomes for clients – benefits which are ultimately passed on to their customers.

How utilities are using data science

Municipal water systems supply critical services to millions of homes and businesses. However, much of the infrastructure these systems rely on was installed in the mid-20th century. Operating and maintaining these aging systems is therefore extremely challenging, and having an effective asset management plan is critical.

One of the most common and effective uses of data science employed by utilities is risk assessments – reviewing data to identify weaknesses and develop action plans that mitigate them. Current risk assessments aim to identify and resolve issues with specific individual assets. However, planning for the replacement of individual assets to minimize repeated disruption is not the most efficient approach, particularly when high-risk assets are scattered across the distribution network.

By considering the spatial location of assets and grouping high-risk assets into clusters, utilities can target clusters instead of individual pipelines, minimizing set-up costs and disruptions to service. Using these high-risk pipe clusters to guide capital planning enables utility managers to better protect the health of their critical underground systems.
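The grouping idea can be sketched in a few lines. This is an illustration only, assuming each pipe is reduced to a single point with a risk score, and that two high-risk pipes belong to the same cluster when they lie within a chosen distance of each other. The field names, threshold and distance are hypothetical, and this is not a description of any commercial tool's method.

```python
from math import hypot


def cluster_high_risk(pipes, risk_threshold, max_gap):
    """Group high-risk pipes that sit within max_gap of one another.

    Assumes each pipe is a dict with hypothetical keys "id", "x", "y"
    and "risk". Pipes are linked transitively: if A is near B and B is
    near C, all three land in the same cluster.
    """
    high = [p for p in pipes if p["risk"] >= risk_threshold]
    unvisited = set(range(len(high)))
    clusters = []

    while unvisited:
        seed = unvisited.pop()
        stack, members = [seed], [seed]
        while stack:
            i = stack.pop()
            # Collect neighbours first, then mark them as visited.
            near = [j for j in unvisited
                    if hypot(high[i]["x"] - high[j]["x"],
                             high[i]["y"] - high[j]["y"]) <= max_gap]
            for j in near:
                unvisited.remove(j)
                stack.append(j)
                members.append(j)
        clusters.append(sorted(high[m]["id"] for m in members))

    # Largest clusters first, so they can be considered first in planning.
    return sorted(clusters, key=lambda c: (-len(c), c))


# Made-up example data: coordinates in metres, risk between 0 and 1.
pipes = [
    {"id": "A", "x": 0, "y": 0, "risk": 0.90},
    {"id": "B", "x": 50, "y": 0, "risk": 0.80},
    {"id": "C", "x": 100, "y": 0, "risk": 0.85},
    {"id": "D", "x": 1000, "y": 1000, "risk": 0.95},
    {"id": "E", "x": 60, "y": 10, "risk": 0.20},
]
print(cluster_high_risk(pipes, risk_threshold=0.7, max_gap=60))
```

In this made-up example, pipes A, B and C form one cluster that could be replaced in a single mobilization, while D stands alone and the low-risk pipe E is left out.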

This is the approach taken with the City of Raleigh Public Utilities Department. The utility maintains the water and wastewater infrastructure for some 600,000 customers in Raleigh, North Carolina, and six surrounding communities. With the city’s drinking water distribution system dating back to 1887, the client needed to prioritize capital works using analysis to determine pipeline risk and justify capital investment.

Working together, we conducted a probability of failure analysis using historical data in their GIS, and leveraged Xylem’s Asset Performance Optimization solution to identify contiguous clusters of high-risk individual pipes – optimizing their selection for maximum risk reduction.

Identifying high-risk clusters enabled the client to prioritize pipe replacement projects, reduce mobilization expenses, minimize repeated disruptions and reduce capital planning time by 75 percent. We also helped develop a more efficient asset management program, ensuring the client is empowered to efficiently operate and maintain its system going forward.

In California, our team delivered a similar water network risk assessment plan for the City of Long Beach Water Department. After developing a risk model of each of the client’s pipelines, we developed a robust five-year plan outlining how they could assess the assets going forward in order to address and reduce risk.

As a result, the utility moved from a reactive to a proactive assess-and-address approach, mitigating risks in a timely manner and reducing disruption. Interacting with the client and clearly explaining how to leverage the data extracted from the condition assessment program were critical to the plan’s success.

From black box to glass box

Powerful results like this come from integrated partnerships where data scientists actively collaborate with clients, working together to find solutions to complex problems.

It’s not a case of simply delivering graphs, charts and statistics to clients. As data scientists, we need to explain what the information means and how clients can effectively apply it in practice. The end goal is always to make the output useful for the client.

Accessibility, transparency and a willingness to answer technical questions help build trust in solutions, giving clients a full picture of the how and the why behind insights and the recommended actions.

Knowledge is power, and we’re passionate about helping shine a light on solutions that can help our customers reach their full potential. We bring our customers on the journey from analysis to action, from black box to glass box, ensuring no one is left in the dark.