How Forgely converts your text to spoken audio, what controls you have, and what special markup options exist.
How does the Forgely text-to-speech tool work?
Forgely's text-to-speech tool uses the browser's built-in Web Speech API to convert typed text into spoken audio directly on your device. For Pro voice styles and higher-quality synthesis, it connects to Forgely's cloud TTS service. On the free plan, no audio files are generated on our servers; playback happens entirely in your browser.
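For readers curious what in-browser playback looks like, here is a minimal sketch, not Forgely's actual code. It relies on the standard speechSynthesis.speak() call and SpeechSynthesisUtterance constructor from the Web Speech API. The function name speakInBrowser and the SpeechEnv type are hypothetical, and the environment is passed in as a parameter so the sketch also compiles outside a browser.

```typescript
// Minimal structural types so the sketch compiles without DOM typings.
interface UtteranceLike {
  text: string;
}

interface SynthLike {
  speak(utterance: UtteranceLike): void;
}

// Hypothetical wrapper around the two browser globals the sketch needs.
interface SpeechEnv {
  speechSynthesis?: SynthLike;
  SpeechSynthesisUtterance?: new (text: string) => UtteranceLike;
}

// Speaks `text` with the browser's built-in engine.
// Returns false when the Web Speech API is not available in `env`.
function speakInBrowser(text: string, env: SpeechEnv): boolean {
  if (!env.speechSynthesis || !env.SpeechSynthesisUtterance) {
    return false;
  }
  const utterance = new env.SpeechSynthesisUtterance(text);
  env.speechSynthesis.speak(utterance);
  return true;
}
```

In a browser, the window object would be passed as env. A false return value is the point at which a page might show an "unsupported browser" message.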
What do the speed, pitch, and volume controls do?
Three sliders let you independently adjust playback speed (0.5x to 2x), pitch (lower or higher tone), and volume (0 to 100%). These controls apply in real time, so you can fine-tune the output to match how you prefer to listen. Slowing the rate is especially useful for language learners or anyone following along with a transcript.
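As a rough illustration, slider values like these could be mapped onto the rate, pitch, and volume properties of a Web Speech API utterance. The SliderState shape and function names below are hypothetical. The 0 to 2 pitch range is the Web Speech API's own range and is assumed here, since the pitch slider's exact bounds are not stated above.

```typescript
// Hypothetical slider state; field names are illustrative, not Forgely's.
interface SliderState {
  speed: number;         // playback multiplier, 0.5 to 2
  pitch: number;         // assumed Web Speech API range, 0 to 2 (1 is the default)
  volumePercent: number; // 0 to 100
}

// Values in the form the Web Speech API expects on an utterance.
interface UtteranceSettings {
  rate: number;   // 1 is normal speed
  pitch: number;  // 0 to 2
  volume: number; // 0 to 1
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Clamps each slider to its range and converts the volume percentage to 0..1.
function toUtteranceSettings(sliders: SliderState): UtteranceSettings {
  return {
    rate: clamp(sliders.speed, 0.5, 2),
    pitch: clamp(sliders.pitch, 0, 2),
    volume: clamp(sliders.volumePercent, 0, 100) / 100,
  };
}
```

Clamping before assignment keeps out-of-range input, such as a speed of 3, from reaching the speech engine.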
What are *emphasis* tags and how do I use them?
Wrapping a word in *asterisks*, for example *important*, signals the TTS engine to speak that word with added stress, much as a human speaker would stress a key word in a sentence. Not every browser voice responds identically, but Chrome and Edge voices handle it best.
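To make the markup concrete, here is one way the asterisk syntax could be parsed into plain and emphasized segments. This is a sketch under assumptions: parseEmphasis and Segment are hypothetical names, and how Forgely actually turns an emphasized segment into stressed speech is not described here.

```typescript
// One run of text, flagged when it was wrapped in asterisks.
interface Segment {
  text: string;
  emphasized: boolean;
}

// Splits input into plain and *emphasized* segments, dropping the asterisks.
function parseEmphasis(input: string): Segment[] {
  const segments: Segment[] = [];
  const pattern = /\*([^*]+)\*/g;
  let last = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(input)) !== null) {
    // Plain text between the previous match and this one.
    if (match.index > last) {
      segments.push({ text: input.slice(last, match.index), emphasized: false });
    }
    segments.push({ text: match[1] ?? "", emphasized: true });
    last = match.index + match[0].length;
  }
  // Trailing plain text after the final match.
  if (last < input.length) {
    segments.push({ text: input.slice(last), emphasized: false });
  }
  return segments;
}
```

For the input "This is *important* today", the sketch yields three segments, with only "important" flagged as emphasized. An unmatched single asterisk is left in the plain text.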
What are [pause] controls?
Inserting [pause] anywhere in your text adds a brief, natural pause at that point during playback. This is useful for mimicking the rhythm of speech, adding dramatic effect, or making long passages easier to follow. You can insert multiple [pause] markers in a single block of text.
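One plausible way to handle the markers is to split the text into an ordered list of "speak" and "pause" steps before playback. The sketch below assumes that approach. The names Step and buildPlaybackPlan are hypothetical, and the 400 ms default is an illustrative guess, since the actual pause length is not stated above.

```typescript
// A playback step: either a chunk of text to speak or a timed silence.
type Step =
  | { kind: "speak"; text: string }
  | { kind: "pause"; ms: number };

// Turns text containing [pause] markers into an ordered list of steps.
// pauseMs is an assumed duration, not a documented Forgely value.
function buildPlaybackPlan(input: string, pauseMs: number = 400): Step[] {
  const steps: Step[] = [];
  const parts = input.split("[pause]");
  parts.forEach((part, index) => {
    const text = part.trim();
    if (text.length > 0) {
      steps.push({ kind: "speak", text });
    }
    // Every marker sits between two parts, so add a pause after all but the last.
    if (index < parts.length - 1) {
      steps.push({ kind: "pause", ms: pauseMs });
    }
  });
  return steps;
}
```

Because each marker becomes its own step, two markers in a row produce two consecutive pauses rather than collapsing into one.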
What is live word highlighting?
Live word highlighting highlights each word in the text panel as it is spoken. This helps you follow along, catch errors in your writing, or use the tool as a reading aid. It is powered by the browser's SpeechSynthesisUtterance boundary events and works automatically when you press Play.
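Boundary events report a charIndex, the position in the text where the word being spoken starts. A minimal sketch of turning that index into a highlight range might look like the following. The names wordRangeAt and WordRange are hypothetical, and treating a word as a run of non-whitespace characters is an assumption of this sketch.

```typescript
// Start (inclusive) and end (exclusive) offsets of a word in the text.
interface WordRange {
  start: number;
  end: number;
}

// Finds the word beginning at charIndex.
// Returns null if the index is out of bounds or points at whitespace.
function wordRangeAt(text: string, charIndex: number): WordRange | null {
  if (charIndex < 0 || charIndex >= text.length) {
    return null;
  }
  // Match the run of non-whitespace characters starting at charIndex.
  const match = /^\S+/.exec(text.slice(charIndex));
  if (!match) {
    return null;
  }
  return { start: charIndex, end: charIndex + match[0].length };
}
```

In a browser, a boundary handler on the utterance could pass event.charIndex to this function and apply a highlight style to the returned range.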