Reading Your Draft With Amazon Polly

I’m about three-quarters through the editing process for my second book, A Price to Be Paid, and I’m just about up to the point where I’d normally load the e-pub file onto my Kindle and read on the target device, making notes as I do so. I find it’s helpful to read the novel on a Kindle, as my eye spots errors more consistently on it than on my monitor.

I’m always open to new things, though, and I keep reading about the power of having your novel read back to you. It makes sense: if there are any issues with grammar, tense, etc., then having the novel read out should make them stick out like a sore thumb.

I’m not completely sure about Windows, but I know that on Apple (both macOS and iPad/iOS), any text can be read aloud using the Accessibility — Spoken Content settings.

It’s all a bit naff, though; the speech isn’t particularly great, sounding neither natural nor flowing. It also messes up pronunciation with alarming frequency, and introduces weird long pauses after commas, etc. So while it technically works, I didn’t find it as helpful as I hoped, since I was continually taken out of the experience, which lowered my attention to actual issues.

So, I did a bit of digging, and found what is shaping up to be the perfect solution: Amazon Polly.



It’s basically a Text to Speech (TTS) engine on steroids: its Neural TTS voices sound far more lifelike and natural. It gives questions the right inflection and leaves the pauses you’d expect after commas, full stops, semicolons, etc. Not only that, but it even correctly pronounces my Scottish language, and gives you the option to upload specific pronunciations if you prefer.

Getting up and running with Polly is as easy as creating a new AWS account (this is different to a standard Amazon account) and going for the free tier. On the free tier, Polly will read out up to a million characters per month using the Neural TTS voices, or 5 million using standard voices. For a frame of reference, A Price to Be Paid is just over 75,000 words and contains 410,000 characters. So I could do two full reads of my novel in any calendar month as part of the editing process, all for free.

This free tier only lasts for the first 12 months, but the pricing seems very reasonable after that: the Neural TTS is priced at US$16 per million characters, which, again, covers a little over two full read-throughs of a novel this length.
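If you want to run the same sums for your own manuscript, here is a rough Python sketch of the arithmetic. The figures are simply the ones quoted in this post (410,000 characters, a million free Neural characters a month, $16 per million after that); the function names are my own invention, and the prices may well have changed, so check the current Polly pricing page before relying on them.

```python
# Back-of-the-envelope Polly budgeting, using the figures quoted in this post.
# All constants are assumptions taken from the post, not fetched from AWS.

NOVEL_CHARS = 410_000               # character count of the novel
FREE_NEURAL_CHARS = 1_000_000       # free-tier Neural allowance per month
FREE_STANDARD_CHARS = 5_000_000     # free-tier standard allowance per month
NEURAL_PRICE_PER_MILLION = 16.00    # USD per million characters after the free tier


def full_reads(allowance_chars, novel_chars=NOVEL_CHARS):
    """Whole read-throughs that fit inside a monthly character allowance."""
    return allowance_chars // novel_chars


def cost_per_read(novel_chars=NOVEL_CHARS, price=NEURAL_PRICE_PER_MILLION):
    """Cost in USD of one paid read-through at the per-million-character rate."""
    return novel_chars / 1_000_000 * price


print(full_reads(FREE_NEURAL_CHARS))     # read-throughs per month on Neural voices
print(full_reads(FREE_STANDARD_CHARS))   # read-throughs per month on standard voices
print(round(cost_per_read(), 2))         # USD per read-through once the free tier ends
```

Swap in your own character count (most writing software will report it) to see how many passes you would get.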

I’m super keen to find out how much of this could be automated. At the moment, I’m copying and pasting each chapter into Polly, saving it as an .mp3 in an Amazon S3 bucket, renaming, downloading, then finally listening. I know that AWS has a powerful Command Line Interface (CLI), so it would be cool if I could figure out how to save a compiled Scrivener project to multiple .txt files, upload them to S3, run an automated Polly conversion and then download the results. Or perhaps there’s even a way to add Polly support to existing software like Scrivener? Imagine being able to simply read out your chapter using the Polly engine while still being inside your writing software of choice.
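I haven’t built this yet, but as a starting point, here is a minimal sketch of how the chapter-by-chapter step might look in Python using boto3, the AWS SDK for Python, rather than the CLI. It assumes the chapters have already been exported as one .txt file each, that AWS credentials are configured, and that each chapter fits within Polly’s size limit for a single asynchronous task. The function name, the folder and the bucket name are all hypothetical.

```python
from pathlib import Path


def queue_chapters(polly, chapter_dir, bucket, voice="Amy"):
    """Start one asynchronous Polly task per .txt chapter.

    `polly` is a boto3 Polly client (or anything with the same method),
    `chapter_dir` is a folder of exported chapters, and `bucket` is the
    S3 bucket Polly should write the .mp3 files to.
    Returns a dict mapping each chapter filename to its Polly task id.
    """
    tasks = {}
    for path in sorted(Path(chapter_dir).glob("*.txt")):
        response = polly.start_speech_synthesis_task(
            Engine="neural",              # the Neural TTS voices
            VoiceId=voice,                # "Amy" is the British voice from this post
            OutputFormat="mp3",
            OutputS3BucketName=bucket,
            OutputS3KeyPrefix=path.stem,  # prefix the audio file with the chapter name
            Text=path.read_text(encoding="utf-8"),
        )
        tasks[path.name] = response["SynthesisTask"]["TaskId"]
    return tasks

# Hypothetical usage (needs `pip install boto3` and configured credentials):
#   import boto3
#   queue_chapters(boto3.client("polly"), "chapters", "my-polly-audio")
```

The tasks run in the background and drop the finished audio into the bucket, so the remaining manual step would be downloading the files. Passing the client in as an argument is just a design choice that makes the function easy to try out with a stand-in before spending any of the monthly character allowance.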

Anyway, I’m off to listen to the rest of my novel. It’s being read to me in a very British accent, courtesy of the lovely Amy. Cheers.

