How we learned to break down barriers to machine learning


SOURCE: ARSTECHNICA.COM
MAY 19, 2022

Welcome to the week after Ars Frontiers! This article is the first in a short series of pieces that will recap each of the day's talks for the benefit of those who weren't able to travel to DC for our first conference. We'll be running one of these every few days for the next couple of weeks, and each one will include an embedded video of the talk (along with a transcript).

For today's recap, we're going over our talk with Amazon Web Services tech evangelist Dr. Nashlie Sephus. Our discussion was titled "Breaking Barriers to Machine Learning."

What barriers?

Dr. Sephus came to AWS via a roundabout path, growing up in Mississippi before eventually joining a tech startup called Partpic. Partpic was an artificial intelligence and machine learning (AI/ML) company with a neat premise: Users could take photographs of tooling and parts, and the Partpic app would algorithmically analyze the pictures, identify the part, and provide information on what the part was and where to buy more of it. Partpic was acquired by Amazon in 2016, and Dr. Sephus took her machine-learning skills to AWS.

When asked, she identified access as the biggest barrier to the greater use of AI/ML—in a lot of ways, it's another wrinkle in the old problem of the digital divide. A core component of being able to utilize most common AI/ML tools is having reliable and fast Internet access, and drawing on experience from her background, Dr. Sephus pointed out that a lack of access to technology in primary schools in poorer areas of the country sets kids on a path away from being able to use the kinds of tools we're talking about.

Furthermore, lack of early access leads to resistance to technology later in life. "You're talking about a concept that a lot of people think is pretty intimidating," she explained. "A lot of people are scared. They feel threatened by the technology."

Un-dividing things

One way of tackling the divide here, in addition to simply increasing access, is changing the way that technologists communicate about complex topics like AI/ML to regular folks. "I understand that, as technologists, a lot of times we just like to build cool stuff, right?" Dr. Sephus said. "We're not thinking about the longer-term impact, but that's why it's so important to have that diversity of thought at the table and those different perspectives."

Dr. Sephus said that AWS has been hiring sociologists and psychologists to join its tech teams to figure out ways to tackle the digital divide by meeting people where they are rather than forcing them to come to the technology.

Simply reframing complex AI/ML topics in terms of everyday actions can remove barriers. Dr. Sephus explained that one way of doing this is to point out that almost everyone has a cell phone: talking to your phone, unlocking it with facial recognition, and getting recommendations for a movie or the next song to listen to are all examples of interacting with machine learning. Not everyone groks that, especially technological laypersons, and showing people that these things are driven by AI/ML can be revelatory.

"Meeting them where they are, showing them how these technologies affect them in their everyday lives, and having programming out there in a way that's very approachable—I think that's something we should focus on," she said.

What about companies?

Turning away from individual acceptance and access to AI/ML and looking briefly at how AI-averse companies and organizations might begin adopting machine learning, Dr. Sephus explained that there's no silver bullet or quick answer. It's a slow process of—and please excuse the business-speak—getting buy-in from all the stakeholders.

Many companies get swept away by a tech demo or wily enterprise sales team, but Dr. Sephus said that the best types of AI/ML implementations occur where there's both "grassroots" acceptance from the tech side of a company and also support from the leadership side. As anyone who has worked in enterprise IT architecture can tell you, leadership buy-in can be a tricky thing with any new tech solution—so if you're in that position, the way to sell AI/ML to your bosses is to come correct. Have a cogent understanding of your organization's business practices and security issues and look at the potential risks of changing those processes to include AI/ML tools. It's the usual refrain: assess the issue, then ask yourself what problems you might have failed to foresee.

To complicate things, the AI/ML landscape changes depending on what size company we're talking about. Smaller companies might not need to deal with massive data sets, which means less expense and potentially simpler tools and processes. Larger companies have bigger issues—often both with the size of the data they have to deal with and with the complexity of tools.

But there's another lurking issue that complicates things: bias.

Dr. Nashlie Sephus, discussing access to machine learning at Ars Frontiers 2022.

Avoiding bias

Ultimately, machine-learning data sets and tools are built by people, and people have blind spots. It's pretty typical in AI/ML for those blind spots to make inconvenient appearances—often in the results of AI/ML experiments. Care needs to be exercised at every step in the process to check for the introduction of biases and to excise them.

Dr. Sephus explained a bit about bias avoidance and how extensive an activity it has to be. The key takeaway: Teams must carefully think through their goals and ask questions about bad end states, then work backward to ensure those undesirable results don't happen. Data must be appropriately diverse (you don't want your facial recognition data set to be made up primarily of light-skinned people, for example, because the resulting system will perform far worse on everyone else).
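To make that "work backward from bad end states" idea concrete, here's a minimal sketch of a pre-training data audit. Everything in it is hypothetical—the `skin_tone` field, the toy records, and the 40 percent threshold are illustrative choices, not part of any real data set or AWS tooling:

```python
from collections import Counter

# Hypothetical face-data records; the field name, values, and
# threshold below are illustrative only.
samples = [
    {"id": 1, "skin_tone": "light"},
    {"id": 2, "skin_tone": "light"},
    {"id": 3, "skin_tone": "dark"},
    {"id": 4, "skin_tone": "medium"},
]

def audit_balance(records, field, max_share=0.4):
    """Return any category whose share of the data exceeds max_share."""
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()
            if n / total > max_share}

# "light" makes up 50% of this toy set, above the 40% threshold,
# so the audit flags it for rebalancing before any training happens.
print(audit_balance(samples, "skin_tone"))  # {'light': 0.5}
```

In practice a check like this would run once per protected attribute, before training, as one of many gates on the pipeline rather than a one-time fix.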

Another important factor, Dr. Sephus said, is not blindly using public data sets without understanding what's in them. "You'd be surprised [at] what we've seen," she said, "looking through some of the most commonly sold data sets out there, especially those dealing with humans."

Without an effort to avoid bias, AI/ML can lead to poor outcomes. Dr. Sephus pointed out that in the financial technology world, algorithmic-backed recommendations are often used to determine loan approval, and this can be problematic. Financial institutions are legally prohibited from explicitly considering race in lending decisions, yet minority applicants are more likely to have mortgage applications rejected than white applicants—even after controlling for factors like household income and the level of other debts.

Last year, two finance professors—Laura Blattner at Stanford and Scott Nelson at the University of Chicago—published research exploring how credit-scoring algorithms can disadvantage racial minorities. They obtained a large data set of credit decisions, along with data about applicants' credit scores and subsequent defaults. This allowed them to compare the credit performance of those who received loans against those who were rejected.

If someone was rejected for a mortgage and then later defaulted on another loan, that suggests the bank was right to reject them. The authors compared loan approval rates and default rates for white and minority applicants and found that banks were more likely to reject Black applicants who subsequently proved to be good credit risks.

In their study, Blattner and Nelson consider two possible explanations for this disparity. One is model bias, where credit-scoring agencies choose algorithms that are better suited for evaluating white applicants than Black ones. This could happen if the training data set has many more white applicants than Black ones, causing a badly designed training algorithm to optimize for predicting the creditworthiness of white applicants but then produce erratic results for Black applicants.

But the researchers ultimately concluded that algorithms were not to blame. They experimented with several alternative machine-learning algorithms and found that they all produced similar results, with lower predictive accuracy for minority and low-income borrowers.

Instead, Blattner and Nelson conclude that data bias explains the disparate treatment of white and minority applicants. One issue is what they call the "thin file" problem: Minority mortgage applicants tend to have had fewer credit cards, auto loans, or other credit products over the course of their lives. With less data to work with, credit-scoring algorithms are less confident in an applicant's creditworthiness and are therefore more likely to reject their application.

Of course, a minority applicant's "thin file" might happen through no fault of their own. Maybe discrimination prevented them from getting loans earlier in their lives. Maybe their parents couldn't afford to send them to college, leading to lower earnings. An algorithm doesn't have any of that context—it just sees a limited credit history and concludes that it represents a higher risk.
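The thin-file effect can be sketched numerically. The following snippet is a toy illustration (not how any real credit-scoring model works): it computes a pessimistic upper bound on an applicant's default rate, and with the same observed 10 percent default rate, the applicant with fewer prior credit products gets a worse estimate purely because there's less data:

```python
import math

def pessimistic_default_rate(defaults, credit_products, z=1.96):
    """Upper bound of a 95% normal-approximation confidence interval
    for the default rate. Fewer prior credit products (a "thin file")
    means a wider interval and a higher pessimistic estimate."""
    p = defaults / credit_products
    margin = z * math.sqrt(p * (1 - p) / credit_products)
    return min(1.0, p + margin)

thick_file = pessimistic_default_rate(2, 20)  # 10% observed over 20 products
thin_file = pessimistic_default_rate(1, 10)   # 10% observed over 10 products
print(f"thick file: {thick_file:.3f}, thin file: {thin_file:.3f}")
```

A lender that rejects everyone above a fixed risk cutoff will therefore reject the thin-file applicant more often, even when both applicants' true risk is identical.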

Dr. Sephus urged companies to think carefully about these kinds of data issues. "You shouldn't just take publicly available data sets [at] face value," she said. "There could be a lot of issues with the data."

Blowing away barriers

As we wrapped up, I asked Dr. Sephus to talk a bit about her nonprofit work in Jackson, Mississippi, where she founded a nonprofit organization called The Bean Path. "We've helped over 1,500 people in the last three-plus years gain exposure and access to technology," she explained, "things that should be household names, but oftentimes they aren't in certain communities."

The Bean Path needed space to scale up, so Dr. Sephus began acquiring land and buildings in downtown Jackson to build out something that has come to be known as the Jackson Tech District. The development project will open its doors later this summer with its first building, which will feature a makerspace where community members can use 3D printers, laser cutters, and other maker tools to build things. More buildings and projects will follow.

She was clearly excited about being able to attack the problem of access to technology. By introducing this and other technology to folks who wouldn't ordinarily have encountered it, you can create some sparks—and, for the right person, those sparks might turn into a brightly burning love for technology.

"It's all about how... you bridge that access to these technologies and this programming, learning about AI and NFTs and all these things—cloud computing for the average person," she said. "We're opening that this summer. We have a big summer camp we're starting. So the rest is—the sky's the limit from there."

Ars author Tim Lee contributed to this report.

Listing image by iStock / Getty Images Plus

Lee Hutchinson is the Senior Technology Editor at Ars and oversees gadget, automotive, IT, and gaming/culture content. He also knows stuff about enterprise storage, security, and human space flight. Lee is based in Houston, TX. Email: lee.hutchinson@arstechnica.com
