Amazon Polly - AI Voice Generator

Deploy high-quality, natural-sounding human voices in dozens of languages

What is Amazon Polly?

Amazon Polly is a fully managed service that generates voice on demand, converting any text to an audio stream. It uses deep learning technologies to convert articles, web pages, PDF documents, and other text to speech (TTS). Polly provides dozens of lifelike voices across a broad set of languages, so you can build speech-enabled applications that engage and convert. Meet the diverse linguistic, accessibility, and learning needs of users across geographies and markets. Powerful neural networks and generative voice engines work in the background, synthesizing speech for you. Integrate the Amazon Polly API into your existing applications to become voice-ready quickly.

  

Capabilities

Amazon Polly offers a variety of capabilities, including those listed below.

Lifelike voices

Deliver conversational user experiences with consistently fast response times

When requesting Amazon Polly output, you can choose from dozens of lifelike voices and various languages. Each voice is created using native speakers, with voice-to-voice variations even within the same language. Most languages include one or more male and female voices, so you can choose the best fit for your use case.
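As a rough illustration of browsing the available voices, the sketch below groups voice IDs by gender from a DescribeVoices-style response. It assumes the AWS SDK for Python (boto3) and configured credentials for the live call; the helper names `voices_by_gender` and `fetch_voices` are hypothetical and not part of any AWS library.

```python
from collections import defaultdict


def voices_by_gender(response):
    """Group voice IDs from a DescribeVoices-style response by gender."""
    grouped = defaultdict(list)
    for voice in response.get("Voices", []):
        grouped[voice["Gender"]].append(voice["Id"])
    return dict(grouped)


def fetch_voices(language_code="en-US"):
    """Hypothetical helper: call Polly for the voices of one language."""
    # Imported lazily so voices_by_gender() can be used without the SDK installed.
    import boto3

    polly = boto3.client("polly")
    return polly.describe_voices(LanguageCode=language_code)
```

With credentials in place, `voices_by_gender(fetch_voices("en-US"))` would return a dictionary keyed by gender, which you could use to pick a voice for your use case.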

Customizable output

Customize and control speech output as needed

Amazon Polly allows you to create custom text-to-speech output that attracts and holds your audience's attention. Use custom lexicons to modify the pronunciation of acronyms, company names, internal terminology, or any other words you choose. Amazon Polly’s Speech Synthesis Markup Language (SSML) tags also allow you to adjust emphasis, intonation, phrasing, and style. Generate voice AI output that best suits your business.
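To make the SSML and lexicon options more concrete, here is a minimal sketch that wraps text in `<speak>`, `<prosody>`, and `<emphasis>` tags and prepares the keyword arguments for a boto3 `synthesize_speech` call. The helper names, the default voice "Joanna", and any lexicon name you pass in are assumptions for illustration; tag support can vary by voice engine, so treat this as a starting point rather than a definitive recipe.

```python
from xml.sax.saxutils import escape


def build_ssml(text, emphasized=None, rate="medium"):
    """Wrap text in SSML, optionally stressing the first match of one phrase."""
    body = escape(text)  # escape &, <, > so the text cannot break the markup
    if emphasized:
        marked = f'<emphasis level="strong">{escape(emphasized)}</emphasis>'
        body = body.replace(escape(emphasized), marked, 1)
    return f'<speak><prosody rate="{rate}">{body}</prosody></speak>'


def ssml_request(ssml, voice_id="Joanna", lexicon_names=()):
    """Build keyword arguments for polly.synthesize_speech(**params)."""
    params = {
        "Text": ssml,
        "TextType": "ssml",  # tells Polly to interpret the tags
        "VoiceId": voice_id,
        "OutputFormat": "mp3",
    }
    if lexicon_names:
        # Lexicons must already be stored in your account (for example via put_lexicon).
        params["LexiconNames"] = list(lexicon_names)
    return params
```

A call such as `polly.synthesize_speech(**ssml_request(build_ssml("Ship it today", emphasized="today")))` would then request audio with the chosen emphasis, assuming `polly` is a boto3 Polly client.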

Gen AI power

Access built-in gen AI capabilities at a fraction of the cost

Amazon Polly supports multiple voice engines that you can choose from to convert text to speech. The generative engine deploys a billion-parameter transformer to generate voices in an incremental, streamable manner. This AI voice generator creates synthetic speech that is assertive, emotionally engaged, and highly colloquial, similar to a real human voice.
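One way to choose among engines is to check which ones a voice supports and take the first match from a preference list. The sketch below assumes a voice entry shaped like those returned by the boto3 `describe_voices` call, with a `SupportedEngines` list; the preference order and the `pick_engine` helper are illustrative assumptions, not an AWS recommendation.

```python
# Assumed preference order, most capable first; adjust to your cost and latency needs.
PREFERENCE = ("generative", "long-form", "neural", "standard")


def pick_engine(voice, preference=PREFERENCE):
    """Return the first preferred engine that the given voice supports."""
    supported = set(voice.get("SupportedEngines", []))
    for engine in preference:
        if engine in supported:
            return engine
    raise ValueError(f"No preferred engine available for voice {voice.get('Id')!r}")
```

The returned string can be passed as the `Engine` argument of `synthesize_speech`, so the same code path works for voices with different engine support.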

Control and security

Securely store and redistribute speech in standard formats 

Store your text-to-speech output in standard audio files like MP3 and OGG for redistribution, analysis, archiving, or any other use case at no extra cost. Cache your files for faster retrieval if needed. Your content's security, trust, and privacy are AWS’s highest priorities. Amazon Polly does not retain the content of your text submissions.
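Storing and caching output might look like the following sketch, which writes the synthesized audio to a local file and reuses it on repeat requests. It assumes a boto3 Polly client is passed in as `polly`; the cache layout, the hashing scheme, and the helper names `cache_path` and `synthesize_cached` are hypothetical choices made for illustration.

```python
import hashlib
from pathlib import Path

# Polly output formats mapped to file extensions (only the two named above).
EXTENSIONS = {"mp3": ".mp3", "ogg_vorbis": ".ogg"}


def cache_path(cache_dir, text, voice_id, output_format="mp3"):
    """Derive a stable file path from the voice, format, and text."""
    key = f"{voice_id}|{output_format}|{text}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return Path(cache_dir) / f"{digest}{EXTENSIONS[output_format]}"


def synthesize_cached(polly, cache_dir, text, voice_id="Joanna", output_format="mp3"):
    """Return a cached audio file, calling Polly only on a cache miss."""
    path = cache_path(cache_dir, text, voice_id, output_format)
    if not path.exists():
        response = polly.synthesize_speech(
            Text=text, VoiceId=voice_id, OutputFormat=output_format
        )
        path.parent.mkdir(parents=True, exist_ok=True)
        # AudioStream is a file-like body; read it fully and write it to disk.
        path.write_bytes(response["AudioStream"].read())
    return path
```

Keying the cache on voice, format, and text means a change to any of the three produces a new file, while identical requests are served from disk.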
