Arabic-first AI: why Gulf-market AI needs to be built differently
Most Arabic AI is English-thought-in-Arabic-words. Here is what building Arabic-first actually looks like.
There is an Arabic AI problem that nobody talks about at conferences. Most of the Arabic-capable models on the market are trained predominantly on English, with Arabic stitched on through translation and post-training. When you put them in a real conversation, it shows. They translate English idioms badly. They default to Modern Standard Arabic (MSA) when every Gulf caller expects Khaleeji dialect. They cannot read the code-switching that is normal in Dubai business conversations.
Building for the Gulf market means treating Arabic as a first-class conversational channel, not an afterthought. In practice this means four things.
First, dialect handling. "What time can I come in tomorrow?" is said five different ways in five different Gulf countries, and none of them is MSA. An agent for a Saudi pharmacy should lean on Najdi and Hejazi usage. An agent for an Emirati clinic should sound like a Dubai receptionist, not a Cairo news anchor. We maintain a library of dialect-specific voice samples, acoustic profiles, and typical phrasing for each market, and we pick the combination per deployment rather than shipping "Arabic" as a single setting.
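As a rough illustration of "pick the combination per deployment", here is a minimal Python sketch of a dialect library keyed by market. Everything in it is hypothetical: the DialectProfile fields, the market codes, and the voice identifiers are placeholders, not our actual library. The only point is that a generic "Arabic" setting is rejected rather than silently accepted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DialectProfile:
    """Hypothetical per-market bundle: dialects to lean on plus a voice placeholder."""
    market: str
    dialects: tuple
    voice_id: str  # placeholder, not a real voice-provider identifier


# Illustrative entries only; the dialect names come from the examples in the text.
DIALECT_LIBRARY = {
    "SA": DialectProfile("SA", ("Najdi", "Hejazi"), "voice-sa-placeholder"),
    "AE": DialectProfile("AE", ("Emirati",), "voice-ae-placeholder"),
}


def select_profile(market: str) -> DialectProfile:
    """Return the profile for a market code, refusing anything not in the library."""
    key = market.strip().upper()
    if key not in DIALECT_LIBRARY:
        raise ValueError(
            f"No dialect profile for {market!r}; choose a market, not generic 'Arabic'."
        )
    return DIALECT_LIBRARY[key]
```

Calling select_profile("sa") returns the Saudi profile, while select_profile("ar") raises an error, which forces the dialect question to be answered before deployment.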
Second, script handling. Arabic script is right-to-left. Mixed Arabic-English messages need to render correctly in the chat UI, in the CRM notes, in the transcript, and in every downstream system. More than one of our projects has had a silent bug where names with Arabic characters were being stored with invisible Unicode direction markers embedded in them: data that looks fine on the screen and breaks every filter in the backend.
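One way to guard against the direction-marker bug is to strip bidirectional control characters before a name reaches storage. The sketch below is a minimal example in Python using only the standard library. The function names are our own for illustration, and it assumes names should be stored without any directional formatting, leaving direction handling to the display layer.

```python
import unicodedata

# Invisible Unicode bidirectional formatting characters.
BIDI_CONTROLS = {
    "\u061c",  # ARABIC LETTER MARK
    "\u200e",  # LEFT-TO-RIGHT MARK
    "\u200f",  # RIGHT-TO-LEFT MARK
    "\u202a",  # LEFT-TO-RIGHT EMBEDDING
    "\u202b",  # RIGHT-TO-LEFT EMBEDDING
    "\u202c",  # POP DIRECTIONAL FORMATTING
    "\u202d",  # LEFT-TO-RIGHT OVERRIDE
    "\u202e",  # RIGHT-TO-LEFT OVERRIDE
    "\u2066",  # LEFT-TO-RIGHT ISOLATE
    "\u2067",  # RIGHT-TO-LEFT ISOLATE
    "\u2068",  # FIRST STRONG ISOLATE
    "\u2069",  # POP DIRECTIONAL ISOLATE
}


def strip_bidi_controls(text: str) -> str:
    """Remove invisible direction markers, keeping all visible characters."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


def normalize_name_for_storage(name: str) -> str:
    """Strip direction markers, apply NFC, and collapse runs of whitespace."""
    cleaned = unicodedata.normalize("NFC", strip_bidi_controls(name))
    return " ".join(cleaned.split())


stored = normalize_name_for_storage("\u200fمحمد\u200e Ali\u202c")
print(stored == "محمد Ali")  # True: the stored value now matches an exact-match filter
```

Running the same normalization on search input as on stored values keeps both sides of a filter comparable.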
Third, respect for formality. "Please" in Arabic is not one word. The same sentence spoken to a patient's elderly father versus a young employee uses different grammatical forms. The agent needs to read the conversation and adjust register. This is a prompt-engineering problem more than a model problem, but it takes per-client work to get right.
Fourth, code-switching. A typical Dubai call sounds like: "Habibi, yeah I need to book, uh, Thursday afternoon if possible, inshallah." Half English, half Arabic, embedded cultural markers. A model trained to respond only in English will do the polite thing and answer in English. A model trained to detect and mirror will respond in the same register, which is what a human receptionist would do. The latter builds rapport. The former feels robotic.
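To make "detect and mirror" concrete, here is a deliberately simple Python heuristic for spotting code-switching in a text transcript. It is a sketch under stated assumptions, not a production detector: it counts Arabic-script versus Latin-script letters and checks a tiny, hypothetical list of romanized Arabic markers, since callers often type or get transcribed in Latin letters. A real system would need a much richer signal.

```python
import re
import unicodedata

# Hypothetical, far-from-complete list of Arabic expressions written in Latin letters.
ROMANIZED_MARKERS = {"habibi", "inshallah", "mashallah", "yalla", "wallah", "khalas"}


def script_counts(text: str) -> tuple:
    """Count letters whose Unicode names mark them as Arabic or Latin."""
    arabic = latin = 0
    for ch in text:
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        if name.startswith("ARABIC"):
            arabic += 1
        elif name.startswith("LATIN"):
            latin += 1
    return arabic, latin


def detect_register(text: str) -> str:
    """Classify a message as 'arabic', 'english', or 'mixed' (defaults to 'english')."""
    arabic, latin = script_counts(text)
    words = set(re.findall(r"[a-z]+", text.lower()))
    if arabic and latin:
        return "mixed"
    if arabic:
        return "arabic"
    if words & ROMANIZED_MARKERS:
        return "mixed"  # Latin script carrying romanized Arabic markers
    return "english"


call = "Habibi, yeah I need to book, uh, Thursday afternoon if possible, inshallah."
print(detect_register(call))  # mixed
```

The label can then be passed to the agent as a hint to answer in the same mix, rather than falling back to plain English.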
How we build for this: we layer three things. A base model that is genuinely multilingual (we mostly use Claude and GPT-4o-class models, which are honestly good at this now). A voice provider where we have tested the specific dialect output (ElevenLabs with a cloned voice from a native speaker usually wins). And a per-client system prompt that encodes the client's own tone, typical customer profile, and formality rules.
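The third layer, the per-client system prompt, can be sketched as a small Python template that turns a client profile into prompt text. The ClientProfile fields, the example clinic, and the prompt wording are all illustrative assumptions, not our production prompts, and the sketch stops short of calling any model or voice API.

```python
from dataclasses import dataclass, field


@dataclass
class ClientProfile:
    """Hypothetical per-client settings that feed the system prompt."""
    client_name: str
    dialect: str
    tone: str
    customer_profile: str
    formality_rules: list = field(default_factory=list)


def build_system_prompt(profile: ClientProfile) -> str:
    """Assemble a plain-text system prompt from the client's profile."""
    sections = [
        f"You are the receptionist for {profile.client_name}.",
        f"Dialect: speak {profile.dialect} Arabic; do not default to Modern Standard Arabic.",
        f"Tone: {profile.tone}",
        f"Typical customer: {profile.customer_profile}",
        "Code-switching: mirror the caller's own mix of Arabic and English.",
    ]
    if profile.formality_rules:
        sections.append("Formality rules:")
        sections.extend(f"- {rule}" for rule in profile.formality_rules)
    return "\n".join(sections)


# Illustrative client; every value here is made up for the example.
clinic = ClientProfile(
    client_name="Example Clinic",
    dialect="Emirati",
    tone="warm, brief, unhurried",
    customer_profile="families booking appointments by phone",
    formality_rules=[
        "Use respectful forms of address with elderly callers.",
        "Match a casual register only after the caller sets it.",
    ],
)
print(build_system_prompt(clinic))
```

Keeping the rules as data rather than hand-edited prompt text makes it easier to review them with the client and to change one rule without touching the rest.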
The models have gotten good enough that Arabic-first AI is now a question of doing the preparation work, not a question of whether the technology can handle it. Anyone selling you "Arabic AI" without asking which dialect, which formality register, and which code-switching patterns your customers actually use is selling you English AI with a translation layer. Push back.