Teaching computers ʻōlelo Hawaiʻi prompts debate on data sovereignty


SOURCE: HAWAIIPUBLICRADIO.ORG
SEP 15, 2021

Efforts are underway to teach computers to understand ʻōlelo Hawaiʻi (Hawaiian language). Using artificial intelligence technology could be a game changer in advancing the use of Hawaiian language. But some worry about tech companies and control – an area of concern they call “data sovereignty.”

Hawaiian language researchers have compiled around 400 audio recordings of mānaleo, or native speakers, from the 1970s and ’80s. But transcribing those recordings has been a labor-intensive task, says ʻŌiwi Parker Jones, a research fellow focusing on artificial intelligence at Oxford University.

“He hana nui kēlā ʻeā? Ke kikokiko ʻana, hoʻolohe, he mau hola kanaka no ka hoʻomākaukau. A inā hiki ke hana ke kamepiula. Kohu mea kākau ke kamepiula i kekahi kāmua. ʻAʻole paha pololei loa, akā ʻo ka maikaʻi pololei loa. A he kōkua kēlā i ka hoʻowikiwiki.”

Parker Jones says if we can automate that transcription process, the computer can generate a rough draft. It won’t be completely accurate, but it would speed up the process. He’s working on voice-to-text or speech recognition technology to enhance access to these recordings.

He says the challenge with indigenous languages like Hawaiian, which are spoken by a much smaller population than, say, English, is that developing that kind of technology requires tens of thousands of hours of transcribed audio, if not more.

“Pono he mau hola o ka leo i kākau mua ʻia. ʻO ka leo a me ke kikokiko, mau hāneli hola paha i makemake ʻia, a loaʻa kekahi mea maikaʻi. Ma ka ʻōlelo Pelekāne hoʻohana ʻia ma ʻō aku o kēlā hāneli kaukani hola paha.”

Parker Jones says he could create a program with several hundred hours of audio, but he estimates similar English programs use at least hundreds of thousands of hours of audio.

Keoni Mahelona, Chief Technology Officer for Te Hiku Media in Aotearoa (New Zealand), has been a vocal proponent of indigenous communities taking advantage of AI and machine learning technology.

“You can do all sorts of crazy stuff, natural language processing, you can look for words that maybe don’t show up often, idiomatic expressions, you can do automatic parts of speech tagging,” says Mahelona. “There’s just boundless opportunities.”

Those include career opportunities, economic opportunities, and opportunities to connect with what it means to be Hawaiian or, in the case of Te Hiku, Māori.

“What we’re trying to do is enhance access to te reo Māori, te ao Māori – the knowledge that’s embedded in the ʻōlelo,” says Mahelona. “The ʻōlelo itself was quite different. What we’re trying to do is to decolonize te reo Māori, decolonize the sound, decolonize the actual language, and decolonize the digital space.”

Te Hiku Media has positioned itself as a leader in indigenous AI, developing the first automatic speech recognition technology for te reo Māori, the indigenous language of Aotearoa.

One of the biggest concerns moving forward is that indigenous communities, whether Māori, Hawaiian or otherwise, maintain control over the data involved in creating these programs, and that the benefits flow back to the community, a concept known as data sovereignty.

“ʻAʻole kākou makemake e hāʻawi aku a piha i ka pakeke o lākou ala. Mamake mua kākou i kekahi o kēlā. Makemake kākou e hai ʻia ka poʻe Hawaiʻi no kēlā hana a loaʻa ke kālā i ko kākou kaiaulu.”

Parker Jones says he doesn’t want to just give this data away so tech companies can profit. He wants Hawaiians to be hired to run these programs, ensuring profits return to the community.
