Voice-Based Payment Systems: are they the future of financial inclusion and SME finance?


Voice-based payment systems, driven by advances in artificial intelligence and natural language processing, are rapidly entering the financial ecosystem and moving into the mainstream. In this series of articles, I draw on my direct experience of working closely with a global FinTech team that is developing a voice-led payment system (one that has been tested in multiple contexts) to discuss what these systems imply for enhancing the process and digital literacy of excluded and low-income clients, as well as micro, small and medium enterprises.

As with any disruptive innovation, these systems bring a set of regulatory and supervisory implications that we have been grappling with, and that work is by no means over. While we can certainly see light at the end of the tunnel, crucial issues remain to be addressed, both with the technology and with its regulatory and supervisory implications; I share these insights as well in this series.

Yet I am hugely optimistic that in the near future (by mid-2024), voice-led payment systems will be fully mainstreamed rather than remaining a niche product. When that happens, the financial services sector will undergo "quantum change", in the words of Miller and Friesen (1984). The implications of this for financial inclusion, SME finance and UN SDG 1.4 – which envisages 100% sustained financial inclusion for all by 2030 – are phenomenal, as the following discussion will reveal.

            Benefits of a Voice-Led Payment System

The impacts of a voice-based AI payment system in enhancing process and digital literacy are as follows:

  1. Accessibility

For the Differently-Abled: Current financial systems often don’t cater sufficiently to those with disabilities. A voice-based system offers an alternative, especially for:

Visually Impaired: The system can provide real-time spoken information and feedback, reducing reliance on screen readers or braille.

Motor Disabilities: Those unable to use a keyboard or touchscreen effectively can command the system with just their voice.

Elderly Populations and the Digitally Challenged: The digital divide is often most pronounced among the elderly, though others also feel digitally challenged. A voice interface:

Reduces the need to remember multiple steps or navigate complex user interfaces.

Can be designed to be patient, offering repeated instructions without frustration.

  2. Breaking the Literacy Barrier

Language Proficiency: In many developing regions, a substantial portion of the population may lack basic reading and writing skills. Voice commands:

Bridge the literacy gap, allowing users to simply speak rather than type or read.

Can be adapted for dialects or regional languages.

Digital Literacy: For those intimidated by technology, speaking is more natural and less daunting than typing or swiping.

  3. Urban Informal Sector and Remote Rural Areas

Informal Economy: Millions operate outside the formal economy due to the complexities of formal financial systems.

Voice systems simplify banking, making it more inviting for those in the informal sector to migrate to formal financial services.

  4. Enhanced Process Literacy

Guided Transactions: A voice-based system can:

Walk users through processes, step-by-step.

Address queries instantly, reducing user errors and building confidence.

Personalised Learning: A sophisticated AI can:

Track user behaviour to identify recurring issues or challenges.

Offer targeted advice or tutorials.
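
To make the idea of a guided transaction concrete, here is a minimal sketch in Python. It is an illustration only, not the system described in this series: it assumes speech has already been transcribed to text, and the step names, prompts and the `run_dialogue` helper are all hypothetical. It walks the user through a payment one step at a time and repeats a prompt whenever an answer cannot be understood.

```python
# Hypothetical steps of a guided payment: (field name, spoken prompt).
STEPS = [
    ("recipient", "Who would you like to pay?"),
    ("amount", "How much would you like to send?"),
    ("confirm", "Say 'yes' to confirm or 'no' to cancel."),
]

RETRY_MESSAGE = "Sorry, I did not catch that."


def is_valid(field, answer):
    """Check a transcribed answer for one step (deliberately simple rules)."""
    answer = answer.strip()
    if field == "amount":
        try:
            return float(answer) > 0
        except ValueError:
            return False
    if field == "confirm":
        return answer.lower() in ("yes", "no")
    return bool(answer)  # any non-empty recipient name is accepted here


def run_dialogue(answers):
    """Feed transcribed answers through the steps.

    Returns (collected, transcript): the accepted answers per field and the
    list of everything the system would have said aloud.
    """
    collected = {}
    transcript = []
    answers = iter(answers)
    for field, prompt in STEPS:
        while True:
            transcript.append(prompt)
            answer = next(answers, None)
            if answer is None:
                return collected, transcript  # user stopped responding
            if is_valid(field, answer):
                collected[field] = answer.strip()
                break
            transcript.append(RETRY_MESSAGE)  # patient re-prompt, then repeat
    return collected, transcript
```

For example, calling `run_dialogue(["Asha", "lots", "50", "yes"])` accepts the recipient, re-asks for the amount once because "lots" is not a number, and then records the confirmation. A real system would also need to handle cancellation, time-outs and mis-transcriptions, which this sketch leaves out.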

  5. Multilingual Support

Beyond just multiple languages, a sophisticated voice AI can be trained to understand and respond to a wide array of accents, idioms and colloquialisms, making the platform truly inclusive. I have personally witnessed this in the various test runs.

  6. Security

Voice Recognition: Voiceprint technology can be integrated, making each user’s voice a unique password.

This, combined with other biometrics, can provide multi-factor authentication.

Real-time Fraud Monitoring: AI systems can be trained to detect anomalies in transaction patterns and can verbally alert users, ensuring faster response times.
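
As a rough illustration of what detecting an anomaly in transaction patterns can mean, the sketch below flags a payment whose amount lies far from a user's usual amounts. It is a deliberately simple stand-in, assuming a z-score rule and a threshold of three standard deviations; production fraud models use many more signals. The function names and the wording of the alert are hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, amount, threshold=3.0):
    """Return True if `amount` is more than `threshold` standard deviations
    away from the mean of the user's past transaction amounts."""
    if len(history) < 2:
        return False  # too little history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # all past amounts identical
    return abs(amount - mu) / sigma > threshold


def spoken_alert(amount):
    """Text that a voice system could read out for a flagged payment."""
    return (
        f"We noticed an unusual payment of {amount:.2f}. "
        "Say 'confirm' to approve it or 'block' to stop it."
    )


# Hypothetical past amounts for one user.
past_amounts = [20.0, 25.0, 22.0, 18.0, 24.0]
```

With these past amounts, a payment of 23.0 is treated as normal, while a payment of 500.0 is flagged and would trigger the spoken alert.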

  7. Cost Efficiency

Scalability: Digital voice systems, once set up, can cater to millions without significant infrastructure costs. The benefits are immense when this technology is mainstreamed.

Reduction in Errors: As AI assists and corrects users in real-time, the costs associated with transactional errors decrease.

  8. Education and Training

Voice interfaces can be used to conduct virtual financial literacy workshops or provide daily financial tips, fostering better financial habits.

  9. Feedback Loop

Traditional systems require users to type out feedback. In contrast, voice systems can capture spontaneous reactions and feedback, providing richer data for system improvements. This is a great advantage.

  10. Cultural Shift

As these systems become more prevalent, they can help inculcate a culture of technological trust and reliance – leading to broader acceptance of other technological innovations.

            Challenges Associated with a Voice-Led System

  1. Data Privacy and Protection

Implication: Voice-based systems inherently require the collection of voice data, which can be as unique and sensitive as a fingerprint.

Regulatory Challenge: Regulators need to ensure that financial institutions safeguard voice data with the highest level of encryption, and do not misuse it for unsolicited profiling or marketing.

Possible Solution: Frameworks similar to the EU’s General Data Protection Regulation (GDPR) could be adopted, emphasising explicit user consent and stringent data storage and processing standards.

  2. Authentication and Security

Implication: Voice as a biometric means of authentication can be both an opportunity (adding a layer of security) and a vulnerability (risk of voice spoofing).

Regulatory Challenge: Ensuring that voice-based systems are equipped with anti-spoofing and robust multi-factor authentication mechanisms.

Possible Solution: Mandating the integration of voice with other biometrics or setting standards for voice recognition accuracy.
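
One way to picture such a mandate is as a decision rule in which a voice match alone is never sufficient. The sketch below is a hypothetical example, not a description of any deployed system: the `AuthAttempt` fields, the 0.0 to 1.0 score scale and the 0.9 threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class AuthAttempt:
    voice_score: float      # similarity to the enrolled voiceprint (assumed 0.0 to 1.0)
    liveness_passed: bool   # outcome of an anti-spoofing (liveness) check
    second_factor_ok: bool  # e.g. a PIN or fingerprint was verified


def authenticate(attempt, min_voice_score=0.9):
    """Approve only if the voice is live, matches well enough,
    and a second factor has also been verified."""
    if not attempt.liveness_passed:
        return False  # possible replayed or synthetic voice
    if attempt.voice_score < min_voice_score:
        return False  # voice does not match the enrolled user closely enough
    return attempt.second_factor_ok
```

Under this rule, `authenticate(AuthAttempt(0.97, True, True))` succeeds, while a perfect voice match that fails the liveness check, or that lacks a verified second factor, is rejected. A regulatory standard for recognition accuracy would, in effect, constrain how a threshold such as `min_voice_score` is set.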

  3. Interoperability

Implication: With the proliferation of voice assistants like Siri, Alexa and Google Assistant, there’s a risk of fragmentation in voice payment platforms.

Regulatory Challenge: Ensuring seamless interoperability between different voice platforms and traditional banking systems.

Possible Solution: Setting standardised protocols for voice-based transactions, ensuring cross-platform compatibility.

  4. Accessibility and Fairness

Implication: While voice systems can enhance accessibility, they may inadvertently exclude those with speech impairments or heavy accents.

Regulatory Challenge: Ensuring that financial inclusivity principles are not compromised.

Possible Solution: Requiring financial institutions to offer alternative accessibility options and to continually refine voice recognition algorithms so that they are inclusive of diverse speech patterns.

  5. Consumer Education and Awareness

Implication: Voice-based systems introduce new interfaces and potential risks that consumers might not be familiar with.

Regulatory Challenge: Ensuring that users understand the technology, its benefits and associated risks.

Possible Solution: Promoting transparent user education campaigns and integrating real-time guidance within voice systems.

  6. Continuous Supervision and Auditing

Implication: The dynamic nature of voice and AI systems, which continuously learn and evolve, makes static regulatory checks insufficient.

Regulatory Challenge: Implementing continuous monitoring mechanisms to ensure systems adhere to standards.

Possible Solution: Developing real-time auditing tools and periodic AI behaviour checks, ensuring that systems do not deviate from set ethical and operational standards.

  7. Liability Issues

Implication: In case of misinterpretations or unauthorised transactions, determining liability can be challenging.

Regulatory Challenge: Setting clear parameters on user and provider responsibilities.

Possible Solution: Drafting comprehensive terms of use and service agreements that specify conditions under which a financial institution might be held liable.

  8. International Coordination

Implication: Voice-based payment systems, once integrated into cloud-based platforms, will inherently have a global reach.

Regulatory Challenge: Ensuring that voice-based financial transactions across borders adhere to multiple jurisdictions’ standards.

Possible Solution: Collaborative international regulatory frameworks and mutual recognition of voice-based transaction standards.


To summarise, a voice-based AI payment system has immense potential in redefining financial inclusion and enhancing both process and digital literacy. While voice-based payment systems herald a new era of convenience in financial transactions, they also introduce novel challenges for regulators. The delicate balance between promoting innovation and ensuring security, privacy and fairness requires both proactivity and adaptability from regulatory bodies. Collaborative efforts between tech providers, financial institutions, and regulators will be paramount in shaping a voice-driven financial ecosystem that is both robust and inclusive.
