How Amazon Search Really Works (and How This Helps You Sell More Books)

October 10, 2018 Chris Sim

This post takes a deep dive into how Amazon's search engine works, and explores ways to use this knowledge to help you sell more books online.

Why a 500 keyword character limit is costing you book sales.

August 16, 2018 Chris Sim

I bet that if you’re reading this, you’ve always accepted 500 characters as the gold standard for maximizing book visibility through search - work your way up to 500 characters and you’ve joined the metadata elite. You've surpassed the majority of publishers who add only two or three hundred keyword characters or, gasp, don’t add keywords at all. Well, dear reader, if your ONIX makes its way to Amazon, and Amazon is an important retailer for your sales, I have some news for you:

If you only send 500 keyword characters to Amazon, you’re probably missing out on sales.

Contrary to what you may have been told, Amazon accepts and uses far in excess of 500 characters per book. In this post we’ll uncover the origin of the 500 character limit, and provide actual hard data that shows Amazon does indeed accept keywords far beyond the 500 character limit. (In case you’re wondering why more keywords are better, the short answer: more keywords = more chances for a book to be found in search = more sales opportunities. A longer explanation is available here).

The "Best Practices For Keywords" Recommendation

The BISG’s metadata keyword best practice standard states a recommended limit of 500 characters. The guide is a fine document, and I highly recommend you give it a read in case you haven’t come across it before. I was a member of the committee that authored the document and contributed to its content and discussions. If you’re tasked with “creating keywords for online retailers”, you should absolutely use this resource as a guide. When it comes to keywords destined for Amazon, it’s worth noting that Amazon wasn't involved in authoring the standard, and the “best practices” are a general recommendation to the industry as a whole. They’re not prescriptive, and retailers can implement anything they wish to. In fact the committee, correctly, ensured they weren’t too Amazon-centric in their recommendations. Long story short - even though 500 characters is the “recommended” limit, it doesn’t require Amazon, or any other retailer, to strictly abide by the recommendation.

Where Did The 500 Character Limit Come From?

I asked this question on the committee and the best answer I received was: “it’s always been in the ONIX standard”. As someone who has led teams to build book search engines, I wanted to understand what the technical rationale might be - was there some data-science-based research that underpinned this widely accepted “best practice”? My search led me to the good folks over at EDItEUR who oversee the ONIX standard globally. From my exchanges with these publishing metadata veterans, I learned that:
- The keyword character limit, as it’s defined in ONIX, has always been a “suggested limit”.
- The ONIX standard places no limit on how many keyword characters an ONIX sender can transmit.
- Publishers should rely on receivers (such as retailers) accepting at least 500 characters.
- The limit used to be 100 characters and was then raised to 250 before the current suggestion of 500.
- The 500 limit considers old library systems that may not have the technological resources of retailers, and therefore are limited in what they can receive.
Again, a sensible limit to accommodate myriad publishing systems of various sophistication levels and vintage. But also, still nothing prescriptive about what a receiver (retailer) should do with a keyword field, other than accept at least 500 characters.

It's Easy To Test The Limit, But You Have To Do It Properly

Search technology is complex and doesn’t operate in the same manner as, say, a bank transaction system. If you log in to your online banking website and send money to someone, they'd better receive it. Every cent needs to be managed, accounted for and then auditable by all parties involved. Search engines are somewhat different. When a search engine indexes data, while it may read everything available to it, the use and visibility of the source data comes down to a whole host of factors which are far beyond the scope of this post. Suffice to say that if you supply Amazon with, say, 100 keywords, it will read all of them, but won’t necessarily use all of them for your book (understand more about this here). If your character length test involves typing each of your keywords into Amazon search, paging through countless results to check for your book, then using the absence of your book in results as evidence that the keyword wasn’t indexed - you’re doing it wrong. (We haven’t even touched on partial term matches). Each keyword is an opportunity, not a guarantee.

Cold, Hard, Data: How Many Keyword Characters Should I Provide?

Running a test to correctly identify whether a keyword improves search rank involves analyzing hundreds of books, removing words found in the title, author names and category (BISAC/browse node) names from test queries, and also searching for combinations of individual keyword terms (as a metadata expert, you already know that exact keyword matches are the tip of the search iceberg). We did all of this and here’s what we found:

Amazon indexes at least 1500 keyword characters from the keyword field in an ONIX file (or uploaded directly using Amazon's internal tools).
Kadaxis clients receive up to 1500 keyword characters, so this is the limit we tested. (Our clients receive an average of 1000 keyword characters per book.) We found books matching search queries all the way up to the high 1400s character count after stripping away other metadata we know is indexed. This approach gives us confidence that the book’s presence in search is attributable to the keyword and not other sources (like BISAC names).
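
The stripping step can be sketched roughly as follows. This is a minimal illustration and not the tooling used for the tests: the function names, the simple word tokenization and the sample category label are all assumptions made for the example.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def strip_known_terms(query, other_metadata):
    """Return the query terms that do NOT appear in other indexed metadata.

    `other_metadata` is a list of strings such as the title, author names
    and category names. If nothing is left after stripping, a search match
    for the query cannot be attributed to the keyword field.
    """
    known = {term for field in other_metadata for term in tokenize(field)}
    return [term for term in tokenize(query) if term not in known]


# Hypothetical metadata for illustration (the category label is made up).
metadata = [
    "Medical-Surgical Nursing Made Incredibly Easy",
    "Medical / Nursing / Medical & Surgical",
]

print(strip_known_terms("advanced pathophysiology", metadata))
print(strip_known_terms("surgical nursing", metadata))
```

A query that comes back empty (like "surgical nursing" here) is discarded from the test, because the title alone could explain the match.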

30-50% of the keywords that matched searches were found in the 500-1500 character range
Said another way: if you’re only adding 500 keyword characters, your book is missing out on 30-50% of the search queries it could have matched had you used 1000+ characters.

Books with 1000-1500 keyword characters match 67% more search queries, compared to books with 500 characters or less.
One of our tests compared 100 books from two trade publishers - one with Kadaxis keywords, and one that used an alternative service that maxes out at 500 characters. Both sets included a mix of well-selling fiction and non-fiction titles and were run through identical measurement systems. The Kadaxis publisher had an average of 1098 keyword characters (max 1500) which matched to 67% more search queries than the publisher who added an average of 446 keyword characters (max 500).

Let's look at a couple of examples to illustrate further:

Title: Medical-Surgical Nursing Made Incredibly Easy (Incredibly Easy! Series)

This medical text matched numerous keywords in search, but one example of note is the keyword phrase "advanced pathophysiology". Neither of these terms is found anywhere else in the book's metadata (as an aside, while this keyword is also not in the description text, know that the description isn't indexed for Amazon search, but that's a post for another day).

The keyword itself is present in the keyword field at character position 964 (out of 1079 total keyword characters). We can find the book in the Books search engine on Amazon for the search query "advanced pathophysiology", wedged between two other pathophysiology books. Note that the term is relevant for people interested in the topic, as evidenced by review mentions.

Let's take a look at one more example:

Title: The Greatest Story Ever Told--So Far: Why Are We Here?

Again, this title matches numerous keyword-derived search queries, but we'll focus on one keyword: "heisenberg uncertainty", which refers to Werner Heisenberg's Uncertainty Principle. This keyword isn't mentioned anywhere else in the book's metadata, but is mentioned several times by readers, including examples where "The Greatest Story Ever Told" helped readers to better understand Heisenberg's Uncertainty Principle.

The keyword "heisenberg uncertainty" is present in the book's keyword field at character position 1101, and the book is found in the search results among related titles.

A few other examples, from hundreds in our set:

Blockchain Revolution: How the Technology Behind Bitcoin Is Changing Money, Business, and the World. Keyword with search match: "smart contracts", character position 1142.
The Perfect You: A Blueprint for Identity. Keyword with search match: "neuroscience", character position 1267 (actually matches 5 search queries about neuroscience).
Darkfever (Fever Series, Book 1). Keyword with search match: "male characters", character position 1095.
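
If you want to check where a keyword sits in your own keyword field, a small helper like the one below is enough. It is a sketch under stated assumptions: the keyword field is treated as one semicolon-separated string, positions are counted from 1 with separators included, and the sample field is synthetic filler rather than real book data.

```python
def keyword_position(keyword_field, keyword):
    """Return the 1-based character position where `keyword` starts in the
    keyword field, or None if the keyword is not present."""
    index = keyword_field.lower().find(keyword.lower())
    return None if index == -1 else index + 1


def beyond_limit(keyword_field, keyword, limit=500):
    """True if the keyword starts after the first `limit` characters."""
    position = keyword_position(keyword_field, keyword)
    return position is not None and position > limit


# Synthetic keyword field: 40 filler phrases followed by the keyword of interest.
field = "; ".join(f"placeholder phrase {i}" for i in range(40)) + "; smart contracts"

print(len(field), keyword_position(field, "smart contracts"))
print(beyond_limit(field, "smart contracts"))
```

Any keyword for which `beyond_limit` returns True would be lost if the field were truncated at 500 characters.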

What does this all mean? More keywords equals better search visibility, which equals more chances to sell (after all, most books are sold through Amazon search).

Keyword ROI: It’s Worth It

1000 characters is a lot of work, 1500 characters even more so, but compared to the rest of the effort and expense that goes into making a book a success, it’s a comparatively small investment for the potential upside - especially when you consider the compounding value better search visibility earns a book. If you have questions about how we conducted our tests, how you can replicate our results or to learn more about our keyword services, please get in touch via the Contact form.

Thank you to the folks at Firebrand (Catherine Toolan, Steve Rutberg and Joshua Tallent) for their help with this article.

A Beginner's Guide To Author Marketing Through AMS

July 23, 2018 Victoria Greene

Through the wonders of modern technology, though, there is a great degree of flexibility in how you can approach your marketing, and there may be no better platform than Amazon for getting eyes on your work. It’s the biggest ecommerce site in the world, with book sales that bring in billions of dollars each year, and the Amazon Marketing Services (AMS) system makes it possible to serve highly-targeted PPC ads to its massive global audience.

Measuring Keyword Effectiveness on Amazon

January 8, 2018 Chris Sim

So you've found the perfect keywords for a book, how do you know if they're effective? Off-page keywords aren't visible to potential customers, so assessing whether they'll work or not takes a completely different approach to assessing visible metadata (title, description, etc.) about a book. The purpose of a keyword is to help a search engine return a book to a customer in a set of search results, in response to a search query. If a keyword doesn't achieve this one task, then it is of no value to you. It doesn't matter how beautifully descriptive or categorically accurate a keyword is, if it doesn't help your book show up in search, then it is worthless. Public data, such as categories or subtitles, are dual purpose - they can improve your book's search presence and also help to convince a searcher to buy your book.

With an understanding of an off-page keyword's purpose, how do you measure its effectiveness on Amazon search? Amazon doesn’t share much data in the way of search metrics and even the data in Amazon's Retail Analytics (ARA) is pretty limited. But it’s still possible to measure whether a book has poor or good search visibility by analyzing public data. The most rudimentary method is to type every keyword into search, and see if the book appears, or ranks, in the search results. If it doesn’t show up in the results, then it has no visibility for that search query and its effectiveness is nil. It’s pretty simple.

If a book does show up for the keyword, where does it rank? By rank, we mean the position it holds in the search results. Is it number one? Number 5? Does it show up on the first page, second page? Search rank can have a huge impact on conversion. Customers are significantly more likely to click on the first few results of a search query than on results further down the page. Techniques such as Discounted Cumulative Gain, used to measure search quality, are predicated on this assumption.
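
To make the position effect concrete, here is a minimal sketch of the standard Discounted Cumulative Gain formula, in which a result at rank i contributes its relevance divided by log2(i + 1). The relevance values (1 for "our book", 0 for everything else) are an illustrative assumption, not data from Amazon.

```python
import math


def dcg(relevances):
    """Discounted Cumulative Gain for a ranked list of relevance scores.

    The first item is rank 1. Each score is discounted by log2(rank + 1),
    so the same result is worth less the further down the page it appears.
    """
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


# The same book shown first versus shown fifth in a list of five results.
ranked_first = [1, 0, 0, 0, 0]
ranked_fifth = [0, 0, 0, 0, 1]

print(dcg(ranked_first))
print(dcg(ranked_fifth))
```

Under this measure the book in fifth place earns well under half the credit of the same book in first place, which mirrors the drop-off in customer attention described above.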

Beyond simple one-to-one keyword to search query testing, to get a true appreciation of a keyword’s impact, you also need to test for derived keyword combinations. As an example take these three key phrases: “ya romance”, “contemporary romance” and “thriller”. These keywords will also match to the search queries: “ya contemporary romance”, “romance thriller”, “contemporary romance thriller” and so forth. Books are also matched against search queries that only partially match keywords. Using the same example, the book might also match “romance suspense” and “ya paranormal romance” even if “suspense” and “paranormal” weren’t specified as keywords for the book. Truly figuring out how well a book ranks in search requires working out all of these combinations, then running searches for each of them.
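
Enumerating the derived combinations by hand gets tedious quickly, so here is a rough sketch of how it could be automated. It is a brute-force assumption on my part (every ordering of two or three distinct terms), not a description of how Amazon combines terms, and many of the generated queries will be nonsense that nobody searches for.

```python
from itertools import permutations


def derived_queries(key_phrases, max_terms=3):
    """Generate candidate search queries by recombining keyword terms.

    Splits each key phrase into individual terms, then builds every ordered
    combination of 2..max_terms distinct terms. The original key phrases are
    excluded, since they are tested separately as exact matches.
    """
    terms = []
    for phrase in key_phrases:
        for term in phrase.lower().split():
            if term not in terms:
                terms.append(term)

    queries = set()
    for size in range(2, max_terms + 1):
        for combo in permutations(terms, size):
            queries.add(" ".join(combo))

    return queries - {phrase.lower() for phrase in key_phrases}


queries = derived_queries(["ya romance", "contemporary romance", "thriller"])
print(len(queries))
print("ya contemporary romance" in queries, "romance thriller" in queries)
```

Partial matches such as “romance suspense” can't be generated this way, because “suspense” isn't one of the book's terms. Finding those requires a list of real customer queries to compare against.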

Once you find a match for your book, you then need to calculate how valuable the search query is, as they’re not all equal. Ranking in the top 5 for “romance” will generate more traffic for a book than ranking in the top 5 for “cozy beach romance set in florida”. The latter search query is more specific and while it receives lower search volume, it may have higher conversion potential. Long-tail searches are more granular, specific and easier to rank for, and are likely to generate better leads. Publishers almost never include them.

On Amazon, the majority of customer searches are long-tail - in fact a large percentage of searches have only been seen a couple of times or less. A significant percentage of the search queries on Amazon each month have never been previously observed.

With the knowledge above, it's quite easy, albeit time consuming, to measure the effectiveness of a keyword for a book on Amazon (or any other search engine for that matter).

How A Keyword Sells A Book On Amazon

November 8, 2017 Chris Sim

In order for a keyword to create a sale through Amazon search, two events need to occur:
1. The book needs to be present in the search results (or rank) for search queries matching the keywords added to the book.
2. The book needs to convince customers (or convert them) to click on a search result, add the book to their cart, then check out (or, make the purchase).

Each keyword creates an opportunity for sales to be made, as each keyword provides the Amazon search engine with an additional way to surface a book to customers. The more keywords assigned to a book, the more data the search algorithms have to work with. Not every assigned keyword will be matched to customer search queries, but you can think of each keyword as asking the search engine: have you thought about showing my book to customers who search for this query? It's up to the search engine to decide, but the more questions you ask, the more likely it will find more ways to return a book.

Behavioural Influence On Search Rank

Once keywords are ingested by Amazon, they’re subjected to a whole lot of parsing and processing, and testing with customers.

First, let’s remember that the goal of a retailer search engine is to maximise revenue, not help people find information (like on Google). This means that on Amazon, discoverability and sales are tightly linked. The rationale behind this is pretty straight-forward - if you show the books most likely to sell, more often, you’re more likely to sell more books.

Every time Amazon displays a book to a customer in search, the customer’s action is logged. If the action is positive, such as a click on a link to view the product page, a cart add, a purchase or a read on Kindle, it positively impacts the book’s search rank. The more positive actions a book accrues, the more visibility it’s rewarded with.

Conversely, if customers don’t respond in a positive manner to a book in a search result, then the book will eventually be outranked by other books that customers respond more positively to.

For example, if a book is shown in search, and the books above and below it are clicked, and not the book itself, this is regarded as a negative signal and will mean the book-query pair will be weighted downward. The book’s search rank for the query hasn’t encouraged positive behaviour so it is penalized.

A simpler way to think of this is - if book A sells more than book B when shown in search, keep showing book A ahead of book B, because book A is making more money. Let’s also try putting book C ahead of book B, and see if that book makes more money instead.

The ability to maintain and grow search visibility is dependent on the book converting through this funnel of customer actions. Sales are the ultimate goal for publishers and retailers, and are a strong search magnifier, but these sales are dependent on conversion.

All of these scores attributed to customer behaviour decay over time, which means momentum matters. It’s easier to improve the discoverability of a book that is already selling and converting well, using keywords, than it is to improve a book starting from low sales.
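
To see why decay makes momentum matter, consider the toy model below. Amazon's actual scoring is not public, so everything here is an assumption made for illustration: the exponential half-life form, the 30-day half-life and the event weight are all invented.

```python
def decayed_score(events, today, half_life_days=30.0):
    """Sum behavioural signals, halving each one's weight every half-life.

    `events` is a list of (day, weight) pairs, where `day` is the day the
    customer action happened and `weight` is its assumed value.
    """
    total = 0.0
    for day, weight in events:
        age = today - day
        total += weight * 0.5 ** (age / half_life_days)
    return total


PURCHASE = 1.0  # hypothetical weight for a purchase

# Two books with ten purchases each: one sold recently, one sold long ago.
recent_seller = [(day, PURCHASE) for day in range(80, 90)]
faded_seller = [(day, PURCHASE) for day in range(0, 10)]

print(decayed_score(recent_seller, today=90))
print(decayed_score(faded_seller, today=90))
```

Both books have the same lifetime sales, but in this model the one with recent activity carries a much higher score, which is the sense in which a book that is already selling is easier to lift.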

Conversion And Social Proof

We've all searched for books on Amazon before - you type in a search query and are presented with a list of books. Perhaps you were looking for an exact title, or perhaps you were searching for books about a topic - in either scenario, you might have been swayed to click on a search result if it looked interesting to you.

Think about the factors that persuaded you to click into the book's product page from search. If you were encountering the book for the first time, the only information you would have at that point was the data visible to you in the search result - cover, title, subtitle, price and importantly: the review count and average rating. These last two elements capture the social proof of the expected reading experience.

Social proof is one of the most powerful on-page converters, as customers often trust each other more than the organization trying to sell to them. Reviews are a book’s public record - they convey more value to a reader than whether it has sold well, and let them know what other people thought about their experience (good or bad) and why.

Reviews are a permanent, mostly unbiased and irrefutable history of a book by its readers. And unlike search rank, the social proof from reviews is cumulative and compounds over time - it doesn’t really decay.

Once you have 500 reviews you’ll always have at least that many reviews. With an established positive star rating across a large base of reviews, it’s also highly likely future reviews and ratings will follow a similar pattern, further strengthening the book’s ability to convert.

In a future post, we'll analyse some data to see how social proof relates to search visibility on Amazon, including examples where Top and Recent customer reviews have been impactful. But in the meantime, thinking about a keyword's place and ability to influence a sale in the search funnel, will help you to focus on which element of metadata to prioritize to maximize sales potential.

BISG 2017 Annual Meeting - Rights, Metadata and Marketing Panel

September 29, 2017 Chris Sim

I participated in a panel on "Rights, Metadata and Marketing" at the recent BISG 2017 Annual Meeting, held at the Harvard Club in New York City. Here are my responses to the questions I was asked:

What is currently working for Kadaxis?

Our approach combines machine learning techniques with a deep knowledge of Amazon search. We take a digital marketing approach to keyword creation by understanding how readers search for books. As a result, our publisher clients are experiencing success by using our keywords, and have seen how the right keywords can directly lead to an increase in search visibility.

Creating keywords by hand isn't hard, creating keywords using algorithms also isn't particularly difficult, but creating keywords with an understanding of how a specific search engine (Amazon in our case) uses them can be challenging. This difference in platform optimization is key in creating keywords that have an impact.

What trends do you see in rights, metadata, and marketing?

We're seeing a strong trend away from the traditional gut instinct approach to metadata content curation and marketing, towards decisions backed by hard data. The most effective data engineering we've seen by publishers comes from those who iterate over metadata changes quickly, then use tools to measure the impact. As a result, their internal expertise and specialisation increases, leading to significant improvements to online visibility and sales.

What's not working as well, or where would you like help? What are the persistent problems you think the industry needs to solve?

Keywords have seen a significant increase in visibility for publishers this year, which has meant a significant increase in queries and interest in our service. But as with any new solution or technique, it's human nature to look for a silver bullet to solve a problem (in this case to boost search visibility and sales). Part of our engagement process for new clients is to set expectations that creating impactful keywords requires time and focus. While we can scale retail SEO expertise to work with tens of thousands of books at once, not every book will be boosted equally. Metadata optimization works best when using tools such as audience driven keyword analysis, but it can take iterating over the process to find what works best. A single metadata change is almost never the single piece of the puzzle for a meaningful increase in sales.

Platform optimization is key here also - many companies in the space have come and gone, making the same mistakes of extracting keywords from the content of the book, ignoring the audience and not optimizing for a platform - for us we focus on Amazon search. Creating keywords from a body of text isn't technically challenging, but creating keywords that have a high probability of working on a specific platform is our goal, which can be at odds with a gut instinct approach to keyword optimization.

We've also worked hard on helping to educate publishers on the importance of measuring keyword success - many publishers take the "set, forget and hope" approach to metadata optimization. Without a methodology in place to measure and understand how changes impact a book's performance, it's impossible to know if the changes made a difference or not, and how the process can be improved upon.

How can BISG help in these areas? What should we be thinking about for 2018 and beyond?

Publishers' strengths have always been in identifying the content and creating a product that resonates with an audience. This is the traditional art, the skill that hasn't changed and I don't think it will or needs to. What has changed is how people find books - search and recommendation engines, social media, deals, and so forth. Audiences are reached using these newer systems through data, and they are the perfect target for data analysis techniques.

The better data a publisher has about an audience - the better it can target them.

Think about how other types of data might be shared and accessed and eventually monetized - beyond book metadata. The more you understand an audience, the more likely you are to reach them. Consider all the rich data about reader interests and behaviour that exists online - think book bloggers, email list owners, retailers, and so forth. If this data was captured and available to publishers in a standard format - data owners could monetize their data while publishers and other service providers could access powerful insight into audiences in a standard (potentially real-time) method at a low cost.

Creating a standard around how this data might be shared would be powerful and create significant value for audience data owners and consumers.
Ideally if such data were decentralized and made available on a blockchain, it could facilitate the next generation of market intelligence and discovery services in publishing.

If you could ask the companies represented by the people assembled here today for help in one area, what might that be?

Share with us how you're experimenting with data to understand who a readership is so we can understand and learn with you. Publishers often aren't given enough credit for being innovative. During my time in the industry, I've learnt an incredible amount from many smart publishers and always welcome the opportunity to understand how publishers are redefining marketing.

Machine Learning and Bestseller Prediction: More Than Words Can Say

September 7, 2017 Chris Sim

There’s been much recent conjecture on whether book sales can be predicted by text analysis alone. My company, Kadaxis, has dedicated the past few years to machine learning research and product development for the publishing industry. In our early days, we set out to build an algorithm to predict bestsellers, and tested it in the wild. In this post, I’ll share my perspectives on why the text alone isn’t enough.

If You Publish It, Will They Come?

To predict book sales, you need to account for the factors that influence book sales. The text of a book is core to the product, but many other factors, such as sales and marketing, influence whether a customer will discover and buy it. An algorithm predicting book sales using only the text as input will only work in a book market meritocracy, where the best-written books always sell the most copies.

Author platform (brand awareness) is one such non-text factor that influences sales, as in the following examples:

– The Cuckoo’s Calling hits the top of Amazon’s bestseller list only after Robert Galbraith is revealed to be J.K. Rowling.
– Amy Schumer’s memoir, The Girl with the Lower Back Tattoo, hits the New York Times bestseller list in its first week—an improbable feat without her strong personal brand.
– Dave Eggers publishes multiple books and receives several award nominations prior to releasing bestseller The Circle, and does so after having appeared on television numerous times.

Even amongst well-discovered books, the relationship between reader satisfaction and sales volume can be tenuous. Consider Harry Potter and the Cursed Child and Go Set a Watchman, books that have sold millions of copies each, but have achieved star ratings of 3.6 and 3.4 on Amazon, respectively—scores below the average indicator of satisfied readers.

Many other factors might also influence book sales, such as the editorial process, cover design, marketing budget, seasonal trends and book metadata. A machine, just like a human, needs to consider which of these factors will make a book sell more, to make an accurate prediction.

Machine Reading

Assume for a moment a linear relationship exists between reader satisfaction, discoverability and sales (i.e. the best written books are found the most often and sell the most copies). In this author’s utopia, we can reliably predict sales volume directly from a book’s text, as long as we can measure what’s important to readers. As products go, books are nuanced and complex, and the reasons why they resonate with us are also complex (compared to, say, a toothbrush). How do we uniformly distill the unique traits of a book into data?

This is, of course, where machine learning helps us. One approach, which is also the method used by the authors of the much-talked-about The Bestseller Code, is topic analysis (or latent Dirichlet allocation). This technique allows us to define a book in terms of how much of a topic it contains, such as “Homicide – 8.7 percent.”

If you’d like to see the data a topic model creates, you can view an example from our systems here (or upload your own book for analysis at authorcheckpoint.com). Topic modeling gives us a good snapshot of the content of a book, and allows us to make apples-to-apples comparisons between them. It is also useful data to use as input to training a predictive algorithm.

The Curse of Dimensionality

Our machine reader might define thousands of topics for each book we’re analyzing. While more data points might seem like a good thing, the more we add, the more books we need to read in order to make reliable predictions. If, for example, we had 2,500 different data points about a book, we’d likely need several tens of thousands of books to be confident our algorithm is accurate. Even 20,000 books (the data set used in The Bestseller Code) is likely far too few books, and puts us at risk of the curse of dimensionality.

(A quick tech side-bar: even with cross-validation we’re still likely overfitting our data, and hold-out is no guarantee against this, especially when using heavily unbalanced classes such as “bestsellers” for classification.)

Too many data points, and not enough books, means our algorithm will probably find patterns to say whatever we want them to say. The patterns exist in the data, but they aren’t representative of the real cause of what we’re trying to predict. In the world of black-box trading systems, this phenomenon is well-known.
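
A small simulation shows how easily this happens. The sketch below is purely illustrative: both the "sales" and the "features" are random noise, so any correlation it finds is spurious by construction. The function names and the choice of 20 books are assumptions for the demo.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_chance_correlation(n_books, n_features, seed=0):
    """Return the largest |correlation| between random 'sales' figures and
    any one of `n_features` random features measured on `n_books` books."""
    rng = random.Random(seed)
    sales = [rng.random() for _ in range(n_books)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.random() for _ in range(n_books)]
        best = max(best, abs(pearson(feature, sales)))
    return best


# Same 20 books, first with 5 data points per book, then with 2,500.
print(strongest_chance_correlation(n_books=20, n_features=5))
print(strongest_chance_correlation(n_books=20, n_features=2500))
```

With only a handful of features the best correlation is usually weak, but with thousands of features at least one of them will tend to line up with sales impressively well, despite none of them meaning anything.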

So is there value in analyzing the intrinsic qualities of books such an algorithm might identify as selling well? It might be an interesting exercise, and the similarities the algorithm finds might make sense to a human observer. But you couldn’t reliably conclude that those similarities were the reason the books both sold well. In a contrived example, we might conclude that books with a red cover, 250+ pages in length and featuring a dog instead of a cat, will sell more copies than those without.

There is, of course, a simple way to prove the efficacy of any predictive model, and that is to apply it to new, unseen books before publication.

Predicting What’s Important

Even with access to enough books in our author’s utopia, we of course need a reliable metric to measure. Bestseller lists are a weak proxy for actual sales volume for many reasons, not least for the fact that they reflect “fast sellers,” meaning a book on a list may sell fewer copies overall, over time, than a book that isn’t.

But rather than searching for a magic formula to help move more copies of a book, a more valuable and attainable goal is to solve for reader satisfaction. By tying together data about the content of a book, with data capturing a reader’s reaction to it (beyond tracking where they stopped reading), we can begin to understand the true impact a book has on a particular audience and why. Armed with this insight, we can better match books to readers (recommendation systems) and books to markets.

This article originally appeared on the DBW blog September 28, 2016

7 Author Promotion Strategies You Can Learn From

July 31, 2017 Victoria Greene

This is a guest post by Victoria Greene, a freelance writer and branding expert.

In an ideal world, you will already be looking for ways to promote your book as you’re writing it. After all, you’ve already gone to the trouble of considering your audience and the style of writing they like, as well as analyzing the characteristics of your book’s chosen genre. But in terms of nitty-gritty marketing techniques, here are 7 author strategies you can learn from to promote your book successfully.

1. Make Your Website The Center Of Your Brand

Your website should be the central hub of your promotional operations. But for many authors, the thought of creating a fancy website can be daunting. By using a hosting platform like Squarespace, Shopify, or WordPress, authors can knock up a decent-looking website in a matter of hours – no previous experience in web design necessary. There are many basic templates for small brochure sites; you can also access hosting and all the tools needed to sell and promote your book online.

A basic author website should include:

●   A homepage featuring your book’s cover image, synopsis, and link to buy or download
●   An About page including your background, an image, and contact details
●   A regularly updated blog
●   Links to your product listings on Amazon, Kobo, etc.
●   Book reviews
●   Links to your social media channels

2. Build Real Relationships

Writing is often described as a lonely profession. But to make it nowadays, no author can be an island. The relationships you make both online and in person will make or break your career.

Finding influential voices in your genre or subject matter is easy thanks to social media, but being heard and remembered by these influencers is a different story.

Obviously, trying to get in touch with J.K. Rowling on Twitter isn’t going to be a successful strategy, but starting off locally might be. Find writing groups and authors in your area and make a connection. In building authentic relationships, you will need to take the time to build your own genuine interest in other people’s work. Attend writing events and offer to guest post on another author’s blog every once in a while.

3. Think In Keywords

As the biggest bookseller online, Amazon has refined its internal search engines to the nth degree. Its use of metadata and keywords helps millions of customers find the books they want to read in seconds.

Authors should, therefore, consider their book’s keywords very carefully, in order to draw the most traffic to their page. Remember, it is not just about ‘stating the obvious’ in terms of naming your genre or targeting highly competitive search terms.

To avoid common Amazon keyword mistakes, you should firstly think about your readers and the kind of words they might use to find your book. If non-fiction, might they be searching Amazon for a book that solves a problem? Similarly, if your work is fiction, does it cover any widely discussed or topical themes, such as environmentalism? For more on readers’ search intent on Amazon, take a look at this previous post.

4. Boost Your Email List

Your email marketing strategy could be paramount to your long-term success as an author. Developing a long list of subscribers isn’t easy, but these regular readers are crucial to building up your core fanbase.

Sites like Twitter and Facebook change their terms of use intermittently. This puts writers at the mercy of moderators. Should their sites ever fail, or your profile get removed, you will have no other way of reaching your subscribers from these sites. Adding a link to your mailing list on these profiles will help your core fans stay in touch and engaged with your brand.

Further, make sure you provide your email subscribers with remarkable and exclusive content from time-to-time. For example, a free supplementary eBook download could be a great way to build up anticipation ahead of a new book launch.

Email lists are also one of the cheapest forms of direct marketing available to authors. In order to build a list that drives significant sales traffic, you need to promote and talk about it often within your blog posts and social media updates.

Credit: Maliha Mannan via Unsplash

5. Gather Reviews As Early As You Can

Ideally a good few months before your book is released, you will need to begin the process of gathering reviews. In order to do this, you will need to prepare some ‘Galley Proofs’ of the first couple of chapters in a PDF. Don’t be afraid to approach many different industry voices, publications, and even news publications to get a review. Even if the review is only a couple of sentences, you can use this as part of your publicity material.

6. Don’t Ignore Facebook

You should already be using social media to connect with interested readers. Through online communication there is ample opportunity to promote your book, without relying on the typical front cover image and pitch.

Since everyone and their mother (often literally!) use social media these days, authors ignore sites like Facebook at their peril. There are numerous creative ways to bring up your book’s content, without expressly plugging it.

For instance, if there is a news story that closely mirrors some of the issues the characters in your book face, share this and ask your readers for comments. If your protagonist battles with depression, for example, the latest research findings could easily be related to your readers. Further, if you do this in the development stages of your book, your readers’ observations may help guide your characters’ behavior and make them more believable.

Millions of authors use Facebook as a tool for promotion, but did you know that you can also use the social media site as a direct selling platform? Using Facebook’s marketplace, you can safely and securely sell copies of your book from your author profile page.

You will need to set up a separate Facebook page under your author name, rather than your personal profile. Depending on the ecommerce selling application you opt for, you can very quickly and easily make your own online store for your books, with very little technical know-how required. So why not make the most out of this traffic?

7. Generate Sales Fast With A Promotion

Amazon sales are based on rankings. Books at the top of the rankings are more likely to be discovered by potential readers. So it should be the aim of every author to rank as well as they can.

There is no surefire way to cheat Amazon’s algorithmic ranking system. However, generating many sales over a short time period is bound to contribute some positive traction for your book’s ranking within the site.

Offering a large discount over a 48-hour period is a great way to generate fast sales in the early days of a book launch. Also, be sure to ask these discount readers for a review, so more readers can look you up online.

You should also blast the news of your sale on social media sites and use a book promotion hosting service, such as Bargain Booksy, to help you get the word out to as many people as possible.

Marketing should infuse everything you do as an author, from designing your front covers, to putting your best foot forward with your social media posts. There are many tricks of the trade that can help you boost your website and sales rankings. But ultimately, your success will come down to the amount of genuine effort you’re putting into promoting yourself and engaging with readers.

Victoria Greene is a freelance writer and branding expert. She runs her own blog at victoriaecommerce.com. Here she dispenses advice to authors and brands looking to boost their sales online.

How Do Keywords Impact Sales?

May 22, 2017 Chris Sim

The question I receive most often from publishers is: “How do keywords impact sales?” While adding keywords to book metadata is considered best-practice, publishing businesses are naturally more interested in whether the practice will increase revenue. Keywords in this context are ‘off-page’ keywords, which are sent to retailers in an ONIX feed or added to a book via KDP, Amazon’s dashboard for Kindle books. Keywords aren’t visible to customers, but are indexed directly by retailer search engines (such as on Amazon), and allow publishers and authors to influence how readers find their books online.

At Kadaxis, we’ve added keywords to thousands of books, on behalf of a wide variety of publishers, and while some titles have seen significant short-term sales improvement, in most cases, publishers observe an average overall increase across a portfolio of titles over time. In this post we’ll cover the relationship between search traffic and sales, and outline how the title selection component of a keyword strategy can have an impact.

Keywords Direct Online Shoppers To Books

When purchasing a book online, a customer can take many paths in a session of book browsing. We’ve isolated one path for discussion. A typical path a customer might follow involves:

Typing a search query
Viewing a list of search results
Clicking on a book
Viewing the book’s product information
Making a purchase

Keywords can assist at the start of this flow, by helping books to appear in search results more frequently. But getting customers from search result to purchase is dependent on their previous exposure to the book, product information and other factors. Readers need to discover a book three times before they’re ready to buy, says Peter Hildick-Smith of the Codex Group, and ranking in search presents them with that option.

But the final responsibility to sell the book sits with the book’s product page. The stronger this page is, the higher the likelihood of converting search traffic to sales. Some factors include an appealing title and cover, well-written descriptions, and positive customer reviews. A book with a bland, wordy description and few or negative reviews is unlikely to yield much return from adding keywords.

Sales Leads to Discoverability Leads to Sales

From a publisher’s perspective, a keyword’s core utility is to direct search traffic to books in the hope of selling more copies. If excellent, reader-focused keywords are assigned to a book, these keywords will only serve their function if the book appears in the search results of customers searching for books by those keywords. If the book doesn’t rank for those keywords, they are of no value.

So how do you determine whether a book will rank for its assigned keywords? The best predictor is sales. We consistently see a correlation between sales and the number of keywords a book ranks for: higher selling books also rank higher in search results. Generally, the more a book sells, and the more recently those sales occurred, the more discoverable it will be.

It can be insightful to examine the intent of different search providers when understanding how search works. Ecommerce retailers, such as Amazon, use search to sell products, whereas search companies, like Google, use search to help people find content. The focus on selling in retailer search can strongly influence how discoverable books become. (See also: How do Amazon and Google use my book metadata in search?).

For many reasons, products that have sold well in the past have a high chance of selling well in the future. Amazon exploits this phenomenon in search (and across their site), by boosting the visibility of higher selling books in an attempt to maximize sales. They understand that the odds of a sale are higher if a customer is presented with a popular item, so search results are reordered based on sales data (and other signals, such as page views and conversion rate). This means even the most well reasoned keywords might not have any impact for some books, but for others, they’re afforded the opportunity to rank for disproportionately more search queries.

Maximizing Return Through Title Selection

The myriad factors influencing search visibility, conversion and buyer sentiment make it challenging to determine which books will benefit most from keywords. But since the endeavor is relatively low cost compared to rewriting jacket copy or updating a cover, and the possible return is high, the most prudent strategy to maximize ROI is to add keywords to a number of high potential titles.

Tying the concepts above together, this means selecting titles with:

A high chance of converting: books with good publisher-provided metadata (to assist customers in their buying decision) and customer-created reviews and ratings (social proof).
A high chance of ranking in search: typically books with a solid sales history, ideally performing above the competition, with recent sales valued more highly (or pre-promotion).

Recurring ROI

Titles that respond positively to keywords will experience increased sales over time, while maintaining search visibility and accumulating social proof, criteria which positively reinforce each other. But this can take time to build, and the rate of improvement varies for different genres, audiences, titles, and is heavily influenced by the prevailing zeitgeist of the moment. It’s not uncommon for titles to “tip” after several months of gradual improvement, which is why it’s best to adopt a medium to long-term outlook for any keyword strategy. But once the right keywords take effect, the return can persist long after the keywords were put in place.

As with most sound marketing strategies, keywords aren’t a silver bullet to an overnight improvement in sales. But when applied strategically across a quality catalog, they can significantly impact discoverability, leading to an ongoing recurring increase in sales over time.

This article originally appeared on the DBW blog May 22, 2017

Who Uses the Keywords in Metadata?

March 4, 2017 Chris Sim

We often hear that keywords are important to help readers find and discover books. But what does that mean, and do keywords actually make a difference? In this post, we look at how keywords are used to search book websites (in particular, online booksellers), and their adoption by publishers. For this investigation, I had help from Pat Payton (Bowker) and Catherine Toolan (Firebrand). We set out to answer the following questions:

• Are publishers adding keywords to book metadata?
• Are they providing quality keywords?
• Do online booksellers use keywords in their search engines?

In this post, “keywords” refer to consumer-oriented terms to describe a book that are added to an ONIX feed and sent to third parties. These terms aren’t seen by the public and are primarily used for search indexing. Conversely, web search engines (such as Google) don’t make use of ONIX keywords, but analyze the text of public webpages to create search indexes. As book content isn’t public, search providers rely on metadata to help consumers locate books.

Keywords Help Consumers Find Books

Most retailers solve the simple use cases of finding a book by title, author or category. Many searches, however, are comprised of natural language queries that describe different elements of a book, such as its setting, characters, theme or an emotional response to its content. Keywords were designed to fill this gap, by allowing people knowledgeable of the book to specify additional terms by which to find it.

Books are multi-dimensional, complex products that are typically highly nuanced and represent multiple buy trigger points for different types of consumers. Books have much more depth than, say, a kettle or a toothbrush, and determining the best keywords is therefore proportionally complex.

Note that extracting keywords from the book’s text is a naïve approach to solving this problem. The most effective keywords relate to a reader’s experience with a book, and the language she uses to describe it.

Are Publishers Adding Keywords to Their Books?

Bowker analyzed the keywords added to ONIX files from roughly 150,000 publishers, ranging from reprint and self-publishing service providers to university presses, trade, school and audio publishers. Of these publishers, about 23,000 (15.3 percent) had added keywords to at least one book. And of these, smaller publishers (fewer than 100 titles) typically had a higher percentage of keyword coverage than did larger publishers.

Over the past 10 years, though, publishers have increased the number of titles with keywords from approximately 25,000 to approximately 114,000, in 2015. But this number is still a very small proportion of all books available.

How Sophisticated Are Publishers’ Efforts to Choose and Maintain Keywords?

While keywords have been part of the ONIX standard for many years, they definitely rose in importance around 2013. As publishers had whole backlists without keywords, obtaining coverage was (and still is) a resource-intensive task. In order to achieve high coverage of keywords across a catalog, many publishers undertook a stopgap approach, adding other metadata to keywords (from title/subtitle, subject codes, contributors, product format, and audience), which are already available to search providers, and therefore are unlikely to help with search visibility. To improve keyword quality and to recommend against practices such as keyword stuffing, the Book Industry Study Group (BISG) published the “Best Practices for Keywords in Metadata,” in 2014, to guide publishers on choosing effective keywords.

Keyword quality is still low today, though. One example from Bowker shows the use of the keyword “audiobook” (relating to form, not content) in just about 12,000 of the approximately 114,000 titles sampled from 2015.

Do Online Retailers Use Keywords?

Every book search implementation is proprietary, so the exact use of keywords is generally not public knowledge. It is possible, however, to determine whether keywords, when used as search queries, return the books they’re associated with in ONIX.

Kadaxis tested 13 websites that consume ONIX and provide book search, and found that only Amazon showed books returned in search results for keywords attributed to the book in ONIX.

Keywords are central to Amazon’s search capability across all its product lines. The site receives keywords of wildly varying quality from a huge number of product suppliers (from individuals to large companies), which means its capability for filtering, cleaning and incorporating keywords into a search index and mapping these to consumer search queries is sophisticated.

As the quality of keywords provided by publishers is generally low, it is a challenging endeavor for other websites, without this history and experience, to use the data as extensively.

Are Keywords Worth the Investment?

From the research above, Amazon is the only online bookseller making use of keywords today. If increasing sales of books on Amazon is important, then investing in keywords may be worthwhile. As most publishers aren’t adding keywords to their titles (and of those that are, the quality is typically low), there also appears to be a window of opportunity in which publishers can gain a ranking advantage in Amazon’s search by adding keywords to titles.

Conclusion

While some publishers (see here and here) are quietly providing effective, consumer-oriented keywords, most aren’t investing significant resources. But doing so might represent a low cost, low risk investment for a potentially strong, recurring return, at least until a better solution is created that takes the onus of keyword curation away from publishers and authors.

Additional thanks to Chris Saynor from OnixSuite.

This article originally appeared on the DBW blog March 4, 2016

What are off-page keywords?

September 30, 2015 Chris Sim

In the world of publishing metadata, when we talk about keywords, we’re talking about structured off-page keywords, often sent in an ONIX file, from a publisher to a retailer like Amazon. The retailer indexes the keywords and matches them against customer search queries, in order to display relevant books to them. Keywords are made up of phrases used to describe a book and their purpose is to give a search engine clues about how to show a book to consumers. We call them "off-page", because the retailer uses them directly, and doesn't show them to customers, like they do with other book metadata such as the title or description.

Web search engines, such as Google, determine what content such as a web page is about, and also how people might search for the content. Off-page keywords put this burden on publishers or authors, who have the complex task of trying to understand how readers might search, then how a search engine will use the provided keywords.

A typical book search engine that reads ONIX will index various metadata fields, like the title, author, categories and so forth, data whose primary purpose is to inform consumers about the book - it’s public data. It needs to be appealing and be constructed in a way that is optimized for a search engine to work with.

Conversely, the primary purpose of off-page keywords is to directly inform a search engine how to match a book against search queries. The intended audience is a machine, and the data is hidden from consumers - it is "off" the product "page". This private nature gives publishers a lot of freedom to test and experiment.

Here's an official, dry, textbook definition of keywords in publishing:
“Keywords are words or phrases to describe the theme or content of a book. They are assigned by the metadata creator to supplement title, author, description or other consumer facing data.”

While accurate, it leaves out the motivation behind why we use keywords at all.

On the surface, keywords are just a metadata element. But used properly, they can be a powerful discovery mechanism to capture a reader’s experience with a book, in a way that facilitates sharing that experience with others.

Creating effective keywords is an exercise in studying reader psychology and linguistics, requiring empathy and insight into how people communicate about books with each other. If you’re able to think and talk like your audience, you’re more likely to reach them.

Keywords are used to sell all kinds of products online, but creating them is probably toughest for publishers, as books are far more complex and subjectively experienced than other products, like toothbrushes or hair dryers. So figuring out which elements to express can be challenging.

How do search engines use keywords?

Search engines are just computer programs written to find information for us. We type a query, and the engine thumbs through large swathes of metadata to decide what books to display. The richer the metadata, the more search queries the book might match.

A book with only basic metadata (title and author and so forth) will show up in fewer search results than the same book with 100 or even 50 good keywords. Every keyword you add is an opportunity to widen the search funnel, letting you suggest to the search engine another way consumers can find your book.
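
To make the "wider funnel" idea concrete, here is a toy sketch in Python. Real retailer search engines are proprietary and far more sophisticated, so the all-words matching rule, the function names, the book, and the queries below are all assumptions made purely for illustration.

```python
def index_terms(*fields: str) -> set[str]:
    """Split metadata fields into a flat set of lowercase words."""
    return {word for field in fields for word in field.lower().replace(";", " ").split()}


def matches(query: str, indexed_terms: set[str]) -> bool:
    """Return True if every word in the query appears in the book's indexed terms."""
    return all(word in indexed_terms for word in query.lower().split())


# Hypothetical book and queries, for illustration only.
basic = index_terms("The Last Harbor", "Jane Example")
enriched = index_terms(
    "The Last Harbor",
    "Jane Example",
    "lighthouse keeper;coastal maine;grief and healing;slow burn romance",
)

queries = ["last harbor", "lighthouse romance", "maine grief", "space opera"]
basic_hits = [q for q in queries if matches(q, basic)]
enriched_hits = [q for q in queries if matches(q, enriched)]

print(f"Basic metadata matches {len(basic_hits)} of {len(queries)} queries")
print(f"With keywords matches {len(enriched_hits)} of {len(queries)} queries")
```

In this toy setup the title and author alone only answer a query for the title, while the added keywords let the same book answer theme and setting queries too. A query unrelated to the book still matches nothing, which is the behavior we want.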

Most books are sold online, and most people find books through search (per Amazon). If you can improve a book’s visibility in search, you improve its likelihood of selling more copies. A recent study by Recode, a tech news website, found that more shoppers begin their product search on Amazon (55%) than Google (28%).

Do Writers Write What Readers Want To Read?

June 18, 2015 Chris Sim

Have you ever wondered if the genres authors most enjoy writing in match the genres readers most enjoy reading? Before self-publishing, all new books for sale were filtered by agents and publishers, who acquired and worked on books they thought would sell well. If there was an oversupply of manuscripts by authors in a particular genre, the competition to be chosen and published within the genre would be higher too. Enter self-publishing: now any writer can publish, without filter, into any genre they desire. Given the influx of new books across genres, does the proportion of books in each genre meet with readers' demand?

(We focussed on fiction for this experiment).

Methodology (or How to Speed Read 3000 books in 3 hours)

To answer our question, we needed a way to read and understand a good sized sample of self-published books, to determine their genre. You might ask why we couldn't simply use the categories or tags authors themselves apply to their books? The reason is accuracy and consistency - most indie authors don't have years of book categorization experience, working across a number of titles. Even traditionally published books are categorized inconsistently from book to book and from publisher to publisher. The inconsistency is not because publishers are poor at the job, but because standardizing the process would require centralizing the categorization effort. (We've worked with data feeds from all major publishers and have experienced this phenomenon first hand). The only way to derive accurate and consistent categorization is to read a large sample of books, understand how each book relates to each category, and assign it, while ensuring consistency across the sample. One of our systems does just this.

We gave our categorization system over 3000 self-published novels to read and understand (these were books offered free by the author). For each book, our system identified all the topics the novels were about, then used this topical knowledge to assign each book to one or more categories and genres. Overall, our system read over 260 million words and figured out all the genres, categories and topics in the data below, in a few hours.

What writers write

Writers’ Genres

The top genres (by count) detected by our system were Romance, Fantasy and Science Fiction.

Romance was the most popular genre, with 24.4% of books tagged. By combining Science Fiction and Fantasy though, to derive a total score of 32.1%, we can deduce that writers enjoy writing in this genre more than any other. Literary and Mystery & Detective both came in around 6%. How does this compare to what readers read?

Readers’ Genres

To understand the genres readers enjoy reading the most, we looked at revenue data. This doesn't incorporate units purchased or read, or ratings, but in aggregate, revenue is a good proxy indicator for reader enjoyment.

Source: Leading book genres worldwide as of January 2014, by revenue (in million U.S. dollars)

The highest selling fiction genres were Romance/Erotica, Crime/Mystery and Science Fiction and Fantasy. Romance was high in both charts, but we can broadly extrapolate that there’s a potentially underserved market for Crime and Mystery & Detective and an oversupply for Science Fiction and Fantasy books (when combining the two genres in our first chart).

The correlation isn’t perfect of course, as our sample size is small, we're not considering units sold vs. price, and the revenue data is based on the less consistent human classification of books. We also assume the novelists in our sample wrote their books for the joy of it, and didn’t select their categories purely for commercial potential. These points aside, for the purposes of this post, the proportional difference in genres across the two charts is interesting.

BISAC Categories

We also wanted to understand the categorical split of each genre, so we dove deeper and analyzed the individual BISAC categories that made up each genre. The chart below is measured by category composition - which analyzes how much of each book belongs to a category. For example, instead of tagging a book as Romance and Fantasy, our systems tell us the book is 30% Romance and 70% Fantasy.
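
As an illustration only (this is not a description of the system used for this post), category composition can be thought of as normalizing per-category scores for a book so that they add up to 100%. The function name and the raw scores below are hypothetical.

```python
def category_composition(scores: dict[str, float]) -> dict[str, float]:
    """Normalize raw per-category scores so they sum to 100 percent."""
    total = sum(scores.values())
    if total == 0:
        # No evidence for any category: report zero for each rather than dividing by zero.
        return {category: 0.0 for category in scores}
    return {category: 100 * score / total for category, score in scores.items()}


# Hypothetical raw scores, e.g. counts of passages judged to belong to each category.
raw_scores = {"Romance": 45, "Fantasy": 105}
composition = category_composition(raw_scores)
print(composition)  # {'Romance': 30.0, 'Fantasy': 70.0}
```

Expressing a book as a mix of categories, rather than a single tag, is what allows the proportions to be summed across a whole sample of books.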

(BISAC is the US publishing industry’s system for categorizing books. You can read all about it here - https://www.bisg.org/tutorial-and-faq)

This chart closely matches our genre chart, but tells us that Romance books typically consist of more granular categories than Science Fiction and Fantasy categories. This is somewhat reflected in the number of different BISAC subcategories for the genres - Romance has almost 50% more sub-categories than Science Fiction and Fantasy combined. It also alludes to a level of variance in the categories - our system was more easily able to split Romance titles into clearly distinct categories, but for Sci Fi and Fantasy, most content was generalized to Fantasy / General or Science Fiction / General.

(We've also classified tens of thousands of freely available Gutenberg books, which you can browse here. Many of these books were published before BISAC was invented.)

Topics

Next, we dove even deeper to look at the topic composition of our sample of books, and analyzed how much each book was made up of each topic. The topics listed below aren’t industry standard, and were created by our team. Topics allow us to quickly and programmatically understand, at a more granular level, what a book is about.

Given the strong bias for Science Fiction and Fantasy, the top few topics aren't particularly surprising. One observation we can make from this data is that some genres have a higher proportion of genre-specific content than others. For example, a Romance novel will have many romantic scenes and dialogue, and be romance-themed. But the story will often revolve around another topic (western, military, etc.). A Science Fiction or Fantasy novel will usually contain a high proportion of genre-specific content - the whole world of the story will usually relate to the genre. Books in these genres are also likely to encompass elements of other genres. Therein lies the categorization challenge we discussed earlier - should a novel be FIC027130 (Romance / Science Fiction) or FIC028000 (Science Fiction / General) or both? Are the romance elements strong enough for a book to be categorized as a 'Romance' book? Publishers, of course, use knowledge of the book, as well as strategic category selection, to influence placement of their books on bookstore bookshelves.

A few notes on the topics above. 'Existence' - covers concepts such as consciousness, the universe, humanity and realms - elements often found in Sci Fi / Fantasy. 'Vampires' have their own topic (instead of being part of 'Creatures & Monsters') which reflects the more prominent showing of vampires compared to other monsters, in recent fiction. 'Erotica' as a topic is smaller in representation for the reasons we discussed above for 'Romance'.

Conclusion

We speculate that writers write more Sci Fi and Fantasy books, as it's simply a lot of fun to create entire worlds with their own rules, creatures and customs. Mystery & Detective or Crime novels, while also fun to write, are often set in our reality, and typically require some technical or specialized knowledge - details which may need to be fact checked and accurate. Many authors in these genres have had prior experience in the field, or have spent significant effort researching their topics. These books will often teach the reader something, which is appealing to readers.

As a writer, should you switch to Crime and Mystery in order to increase your odds of landing an agent or selling more self-published books? We don't think so. Write in the genre that is the best fit for you, as doing so will be reflected in your published work.

We hope you enjoyed this glimpse into what self-published authors are writing. Please let us know how you interpreted our results in the comments below.

If you'd like to see this data for your book, analyze your manuscript at Author Checkpoint.

A Publisher's Advantage Over Indie Authors

May 23, 2015 Chris Sim

When it comes to book discovery and retail search, traditional publishers have two advantages over indie publishers.

More Keywords

The first is the ability to add more keywords to a book. Most independent authors will be able to add 5-7 keywords to their book's metadata. Each keyword (or phrase) provides an opportunity for the book to be matched with more search queries. The more search queries a book matches, the more times it will show up in search results, which of course means an increase in the potential customers who will see the book.

How is this possible?

Each online retailer (such as Amazon), accepts book metadata through different channels, and processes it for the search engine to use. Most independent authors add their data through a website (such as http://kdp.amazon.com). These websites are coded with specific rules about what data can be added and restrict, for example, how many keywords can be entered. (This is necessary to ensure a minimum level of quality in the data, which can impact search results).

Enter ONIX

Publishers, on the other hand, typically send book metadata to retailers in bulk, using an industry file format called ONIX. Under the ONIX standard, the keyword field has no restriction on the number of keywords that can be added to a book.
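
For readers who haven't seen keywords inside an ONIX record, the sketch below builds the relevant fragment in Python. It assumes the ONIX 3.0 convention of a Subject composite whose scheme identifier is 20 (the "Keywords" entry in ONIX code list 27), with all phrases in a single SubjectHeadingText separated by semicolons. The function name and the keywords are hypothetical, and the current ONIX specification should be checked before relying on this.

```python
import xml.etree.ElementTree as ET


def build_keyword_subject(keywords: list[str]) -> ET.Element:
    """Build an ONIX 3.0 <Subject> composite carrying keywords.

    Assumption: scheme identifier 20 is "Keywords" in ONIX code list 27, and
    keywords travel in one <SubjectHeadingText>, separated by semicolons.
    """
    subject = ET.Element("Subject")
    ET.SubElement(subject, "SubjectSchemeIdentifier").text = "20"
    ET.SubElement(subject, "SubjectHeadingText").text = ";".join(keywords)
    return subject


# Hypothetical keywords for an imaginary novel.
example_keywords = ["small town mystery", "female detective", "cozy winter read"]
subject_xml = ET.tostring(build_keyword_subject(example_keywords), encoding="unicode")
print(subject_xml)
```

Nothing in this structure caps the number of phrases, which is why any limit on keywords comes from how each retailer processes the field rather than from the file format itself.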

Practically speaking, adding a very large number of keywords will lead to limited discovery benefit, as each online retailer will parse and process a maximum number of keywords.

But retailers almost certainly accept more than 7 keywords. The ONIX standard recommends filling the keyword field with 250 characters. The BISG working group, dedicated to book keyword best practices, however, recommends using 500 characters. This working group comprises members from Amazon and Barnes & Noble, the consumers of this book metadata, who use it in their search engines (we covered this group in our last post). Given this recommendation, and the contributors behind it, it's unlikely the book retailers would restrict keywords to less than 500 characters.

So how does this compare to the 5-7 keywords self-published authors are allowed? 500 characters equates to approximately 80 words, which is roughly 27 keywords (phrases of 3 words in length, or more if keywords of 2 words are included). This is almost 4 times as many keywords.
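The arithmetic above can be reproduced in a few lines. This is a rough sketch that uses only the post's own approximations (about 80 words in 500 characters, and a cap of 7 keywords for self-published authors); the constant and function names are illustrative and are not retailer-published limits.

```python
# Rough keyword-capacity estimate based on the post's approximations.
CHAR_BUDGET = 500        # BISG-recommended keyword field length, in characters
APPROX_WORDS = 80        # the post's estimate of how many words fit in 500 characters
KDP_MAX_KEYWORDS = 7     # upper end of the 5-7 keywords available to indie authors


def estimate_phrases(words: int, words_per_phrase: int) -> int:
    """Approximate number of keyword phrases, rounded to the nearest whole phrase."""
    return round(words / words_per_phrase)


three_word_phrases = estimate_phrases(APPROX_WORDS, 3)
two_word_phrases = estimate_phrases(APPROX_WORDS, 2)
ratio = three_word_phrases / KDP_MAX_KEYWORDS
chars_per_word = CHAR_BUDGET / APPROX_WORDS  # includes the separating space

print(f"3-word phrases: {three_word_phrases}")
print(f"2-word phrases: {two_word_phrases}")
print(f"Advantage over {KDP_MAX_KEYWORDS} keywords: {ratio:.1f}x")
```

Shorter phrases raise the count, so the ratio should be read as a lower-end estimate under these assumptions.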

Isn't having too many keywords bad for SEO?

Retail search engines process keywords from metadata in a structured manner (as opposed to web search engines, which extract keywords from content such as web pages), and are unlikely to be subject to keyword dilution, which is the idea that using more keywords reduces an individual keyword's value. Keyword dilution can be a problem for a web page, as the breadth of topics a web page covers is likely to be less than that of an entire book.

It makes sense to use many keywords to describe the many topics in a book.

Does all this equate to a discovery advantage?

It is highly likely. If you use one keyword, your book will be matched to related search queries about that one topic. If you use 20 or more keywords, your book has 20 opportunities to be matched against different types of search queries, therefore significantly increasing the number of customers likely to see the associated book.
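To make the reasoning concrete, here is a toy illustration. It assumes a deliberately naive matching rule (a query matches a book when every word in the query appears somewhere in the book's keywords), which is an assumption made for this sketch and not a description of how Amazon or any retailer actually ranks results. The queries and keywords are invented.

```python
def tokens(text: str) -> set[str]:
    """Lowercase a phrase and split it into a set of words."""
    return set(text.lower().split())


def matches(query: str, keywords: list[str]) -> bool:
    """Naive rule: every query word must appear in the pooled keyword words."""
    pool: set[str] = set()
    for keyword in keywords:
        pool |= tokens(keyword)
    return tokens(query) <= pool


def count_matches(queries: list[str], keywords: list[str]) -> int:
    """Number of queries the book would be matched against."""
    return sum(matches(query, keywords) for query in queries)


# Hypothetical customer queries and keyword sets for a book about back pain.
queries = [
    "back pain",
    "back pain exercises",
    "posture stretches",
    "desk posture",
    "vampire romance",
]
one_keyword = ["back pain"]
several_keywords = ["back pain relief", "posture exercises", "desk stretches"]

print(count_matches(queries, one_keyword))
print(count_matches(queries, several_keywords))
```

Under this simplified rule, the single keyword matches one of the five queries, while the three-keyword set matches four of them. The unrelated query matches neither set.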

Better Search Data

The second advantage for publishers who publish on Amazon is data to help with Amazon SEO. Publishers with a high enough sales volume will be invited to apply for access to ARA (Amazon Retail Analytics), which provides insight into search queries used by customers on Amazon. Core to any effective SEO strategy is the ability to evaluate and assess different keywords (search terms) for search volume (this is what Google Analytics provides free). ARA provides this data, albeit in a slightly obfuscated value called 'Search Frequency', along with data about conversions. Access to this data improves the efficiency and accuracy of keyword selection, as it allows publishers to determine whether to apply a long or short tail keyword to a book (depending on its sales rank), and also to assess the type of books that convert the best for each search term.

We can speculate about why these differences exist. They are likely rooted in the history of traditional publishing and of self-publishing. In the early days of self-publishing, independent authors were much less sophisticated than they are today, as were the service providers that helped them publish. It's likely that disparities such as these, which provide considerable advantage to traditional publishers, will become less pronounced as the self-publishing industry matures.

Why Keywords Are So Important

May 21, 2015 Chris Sim

Crafting effective keywords to add to a book's metadata could be one of the highest return marketing activities to increase online sales potential. This post examines why keywords are so important, and how they affect discovery on Amazon.

Let's break the logic down:
• Amazon is the biggest bookseller in the world.
• Around two thirds of online book sales are made through Amazon.
• Search is how most customers find products on Amazon.
• Keywords directly influence a book's visibility in Amazon's product search.

In Amazon's own words (link requires a seller central login):

Search is the primary way that customers use to locate products on Amazon. Customers search by entering keywords, which are matched against the search terms you enter for a product. Well-chosen search terms increase a product's visibility and sales. The number of views for a product detail page can increase significantly by adding just one additional search term—if it's a relevant and compelling term.

We differentiate between keywords derived from web page text (Google, Bing, etc.) and keywords added to a book's metadata for consumption by a book retailer (Amazon, Barnes & Noble). Web search engines crawl web pages to derive keywords and concepts, to help users find information. Book product search engines consume book metadata (which includes keywords), provided by the publisher or author, and help customers find books to purchase.

As Google's executive chairman, Eric Schmidt, lamented:

People don’t think of Amazon as search, but if you are looking for something to buy, you are more often than not looking for it on Amazon.

Why can't the machines just figure it all out?

So why the difference between a web and a book (product) search engine? Why can't Amazon read a book's text to figure out what to index, just like Google crawls a web page? There are two core reasons for this:

1. Human classification beats machine classification, when done properly. People are better than machines at describing books in terms other people relate to. The technology exists to understand the topical content of a book (we know, we've built it), but for a product search engine, it's more effective for Amazon to put the burden of describing a book in keywords onto the author or publisher. The author/publisher, in turn, has a strong incentive to increase their book's discoverability in search.

2. It's easier for Amazon to do. Pretend you're a technical superstar tasked with building a search engine for millions of books. What solution do you think would be easier to build? One where you had to index 5-20 human curated keywords that describe each book, or one where you had to index tens (or hundreds) of thousands of words per book to find out what it's about? Leveraging an incentivised crowd to manually add descriptive terms in a structured format is a much smarter and technically simpler solution.

Isn't it a search engine, not a discovery engine?

It has been said that search is not discovery, but this perspective doesn't consider the complex task search engines undertake to discern user intent (we've talked about the different user intents when searching before). Let's look at the distinction between book discovery and book search (within the context of a search engine), and how different elements of metadata support different user intents:

Book Search

Searching for a specific book or title supports a customer who has 'discovered' a book through another channel, and is simply visiting a book retailer to purchase the book. In this case, the user intent is obvious, and the implementation is a basic, nuts and bolts 'search' engine. As a publisher or author, you really don't have much to do to optimize for this use case. Your book title and author (contributor) name are specified in the metadata. The engine performs a simple match of these fields against a customer's search query. This is why there is no need to include book title and author name in your keywords.

Book Discovery

Book discovery, in the context of a 'search' engine, supports many cases of different user intent, where a customer isn't searching for a specific book. The engine helps the customer discover books that satisfy their query. For example, customers might use a book search engine to discover:
• a new book to read in their favorite genre ('contemporary romance new releases')
• a book to learn about a trending topic ('books about the islamic state')
• a book to solve a problem ('back pain')

The metadata that directly influences book discovery on Amazon search are keywords.

Cases exist where subtitles and category names impact discovery, but keywords are designed for, and have a direct relation to book discovery. Other discovery mechanisms also exist, of course, such as bestseller lists and item-to-item similarity recommendations, but these are often outside of the control of an author/publisher.

Codifying how customers think about books

Amazon categories are influenced by the way customers naturally group books together, and how they express these categorizations when searching for books. Book categories are continually refined to adapt to shifts in customers' tastes and collective interests. Books are categorized by manually curated metadata (BISAC or Browse Node, Amazon's equivalent of a category), as well as by analyzing a book's keywords. Many categories need a book to be associated with certain keywords in order for it to qualify for the category. Analyzing the Science Fiction & Fantasy category requirements, we see keywords such as: angels, demons, dragons, vampire, aliens, horror and magic. These are all broad book discovery terms that are designed to satisfy users looking to find books by search terms other than title and author.

There is a clear link between how customers mentally label and group books, and how they express their intent when trying to find books. Amazon attempts to replicate this organization via its search engine and associated categorical data. By using the language and terms customers actively use to search for books, it can more accurately answer book queries at scale.

The bulk of the complexity of a successful book search engine lies not in basic title/author matching, but in deciphering a user's intent when broad terms are used for discovery. Helping a customer find and purchase a book when they're unsure of exactly what they're looking for is big business.

After all, the 'search experience team' believe it's about "finding, not searching".

Do readers even discover new books through search?

Unless you have access to internal search query and purchase data from a major online retailer, it's not possible to make an absolute assertion one way or the other. So let's consider some visible signals:

The industry believes so

The BISG has created a working group dedicated purely to defining best practices for keywords in book metadata. These keywords (in almost all cases) are curated by a person, to be stored with the rest of the book's metadata, and used by retailers (such as Amazon and Barnes & Noble) to help consumers find books. These are not the keywords that web search engines, such as Google, extract from the content of book descriptions on product pages.

This working group has published a guide for publishers to use when defining keywords, which is available for download (via free registration). A summary, that doesn't require registration, is also available.

The group comprises members from all the publishing service provider heavyweights (Ingram, Bowker, etc.), all big five publishers (plus many others), Library of Congress, Barnes & Noble and also Amazon.

Most large publishers have also allocated in-house resources (of varying expertise) specifically to curating keywords for their books.

Amazon has invested heavily in Search and Sales Business Intelligence

Access to this data is only available to a small number of organizations that sell a lot online, through a product called Amazon Retail Analytics (ARA). Its goal is to help vendors optimize their product listings to sell more, largely through data optimization for search. Here's a screenshot.

ARA provides publishers with data on how often keywords are searched for (volume), click through rates and conversion rates. It has its limits, but is far more information than most smaller publishers and independent authors have access to.

When considering the investment and focus the publishing industry has dedicated to keywords, which are created for the sole purpose of helping consumers find books - it's challenging to dismiss the vital role they perform in selling books online.

A sales panacea?

Will the perfect keywords alone magically whisk a book to the bestsellers list? No. The fundamentals need to be executed well, which results in a quality, professional product with market demand. Quality can't be faked over the long term, and short term hacks won't lead to sustainable ongoing sales.

Effective keywords increase a book's chance of being located by the right customer, and help augment success achieved through other marketing channels. While keywords can increase a book's exposure, whether a customer discovers a quality book or not, will ultimately be represented by unit sales and reviews.

Conclusion

We've analyzed how keywords work and why they're important - which is to help sell more books in the marketplace where most books are sold. The industry acknowledges the importance of this correlation, as evidenced by its focus and investment in keyword standardization and dedication of resources (at publisher and retailer level). Yet most authors and publishers don't create effective keywords for their books or update them very often. Compared to the effort and resources involved in publishing a title, a well-implemented keyword strategy can be one of the highest ROI marketing activities for a book. In many cases, this represents a strong, currently missed, opportunity for increased book discovery.

Sign up for Author Checkpoint and find keywords for any book.

How Book Categories Have Changed This Year

May 15, 2015 Chris Sim

Amazon continually updates its browse node categories for books, to cater to the shifting needs of the market. In the last six months, 563 new categories were added, and 122 were removed. There were also a number of category name refinements. Categories serve the purpose of helping readers find similar books. As the number of books allocated to a category fluctuates, the granularity of the categories needs to change too. If categories remained static, they'd become unbalanced, with too few or too many books, which would make browsing and searching for books a challenge.

We've summarized the changes to book categories (browse nodes), that have occurred over the past six months, looking at the impact on various top level categories.
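A comparison like this can be sketched as a set difference between two snapshots of category paths. The snippet below is a minimal illustration only: the function names are hypothetical, the snapshot data is invented (apart from the two 'New Adult' paths mentioned in this post), and it is not the tooling used to produce the figures here.

```python
from collections import Counter


def diff_categories(before: list[str], after: list[str]) -> tuple[list[str], list[str]]:
    """Return (added, removed) category paths between two snapshots."""
    old, new = set(before), set(after)
    return sorted(new - old), sorted(old - new)


def count_by_top_level(paths: list[str]) -> Counter:
    """Count category paths per top-level genre (the text before the first '/')."""
    return Counter(path.split("/")[0] for path in paths)


# Hypothetical snapshots taken six months apart.
snapshot_old = [
    "Romance/New Adult & College",
    "Computers & Technology/Software/Example A",
]
snapshot_new = [
    "Romance/New Adult & College",
    "Science Fiction & Fantasy/Fantasy/New Adult & College",
    "Teen & Young Adult/Example B",
]

added, removed = diff_categories(snapshot_old, snapshot_new)
print("Added:", count_by_top_level(added))
print("Removed:", count_by_top_level(removed))
```

Note that a renamed category would show up as one removal plus one addition under this approach, so name refinements need to be reconciled separately.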

Genres with new categories added

There were 345 new 'Teen & Young Adult' categories added in the past six months, which is likely a reflection of the huge increase in YA sales over the past year.

Only one 'New Adult' category was added (Science Fiction & Fantasy/Fantasy/New Adult & College), to take the total to two categories (the other is: Romance/New Adult & College). Rather than expand this newer category, the breadth of 'Teen & Young Adult' has been expanded to accommodate the influx of titles in this area.

Additions to 'Computers & Technology' were dominated by 'Software' (Adobe, Enterprise Applications and Business) and 'Web Development & Design' (Programming and Web Design) sub-categories.

The largest increase to the 'Religion & Spirituality' genre was in the 'Religious Studies' sub-category, comprising 24 of the 52 additions.

Teen & Young Adult Category Additions

Digging deeper into the 'Teen & Young Adult' genre we see that of the 345 additions made, 152 were in fiction and 193 were in non-fiction. These were broken up as follows:

A full list of new categories is available here.

Genres with categories removed

The 'Computers & Technology' genre had the most categories removed (88), but had an almost equal number of categories added (87). Technology experiences rapid changes, so a commensurate shift in categorization of the subject matter is likely to occur.

The second largest genre with categories removed was 'Crafts, Hobbies & Home', and within that genre, almost half were in the 'Home Improvement & Design' sub-category.

For a full list of the removed categories, click here.

Conclusion

Changes in browse node categories reflect shifts in the type of books available for sale. An increase in categories for a genre is likely driven by a combination of additional supply of books in the genre, and of increased effort to improve the searchability of those books. One may speculate how these two factors reflect increased demand for books in a genre.

How do Amazon and Google use my book metadata in search?

May 11, 2015 Chris Sim

This post explains some basic concepts of how search engines work and index your book metadata, and the differences between Amazon and Google search engines.

9 Common Keyword Mistakes

February 3, 2015 Chris Sim

Coming up with relevant and effective keywords is hard! Keeping them up-to-date and optimized for the number of sales your book is currently making is even harder. Here are common mistakes we see authors make when implementing their keyword strategy on Amazon:

1. Choosing keywords that are too broad

2. Not validating that a keyword is commonly used by customers

3. Choosing keywords without much traffic

4. Not monitoring keyword progress (checking search rank for a book)

5. Leaving keywords unchanged for a month or longer

6. Choosing keywords that are too competitive for their book

7. Repeating terms across keywords

8. Not aligning keyword strategy with external marketing activities (to capitalize on sales rank increases)

9. Not having a keyword strategy!

How Copyediting Could Be Disrupted

February 28, 2014 Chris Sim

A human copyeditor is unlikely to be completely displaced by a machine, but a significant portion of common copy edits to manuscripts could be automated. A primitive tool to assist with copyediting exists (AutoCrit), which suggests changes to text based on readability and other metrics. An advanced tool could be created to capture micro edits across multiple manuscripts, compare these edits and then automatically apply the changes where confidence in the change is high.

Publishing houses are best placed to create these specialized copyediting knowledge bases. They could start by installing software on editors' machines to capture each line-edit and log it to a central database. A copyediting rules engine would then analyze the before and after text changes using part-of-speech (POS) tagging to disambiguate word-categories. After collecting enough examples of similar edits, a rule could be learned by the system, and applied to similar occurrences in new text. These rules would be saved as templates that understand POS tagging. A rules-based library already exists that could easily be adapted to support this system.

The new copyedit system will undoubtedly suggest suboptimal changes, or multiple text alternatives. In this scenario, a human would verify the change. The system would learn which changes were preferable, under which circumstances, until it has enough knowledge and confidence to apply edits automatically. The review process could be extended to include feedback from book reviewers, to rate the most effective changes.
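The capture-and-learn loop described above could be prototyped roughly as follows. This is a simplified sketch, not the system proposed in this post: it compares literal words rather than part-of-speech tags, it only learns word replacements (not deletions or insertions), and it uses a simple repeat count as a stand-in for confidence. The function names, the threshold and the sample sentences are all assumptions made for illustration.

```python
import re
from collections import Counter
from difflib import SequenceMatcher


def extract_edits(before: str, after: str) -> list[tuple[str, str]]:
    """Return (old, new) word replacements between two versions of a line."""
    a, b = before.split(), after.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag == "replace"
    ]


def learn_rules(edit_log: list[tuple[str, str]], min_count: int = 2) -> dict[str, str]:
    """Keep only replacements that editors made at least min_count times."""
    counts: Counter = Counter()
    for before, after in edit_log:
        counts.update(extract_edits(before, after))
    return {old: new for (old, new), n in counts.items() if n >= min_count}


def apply_rules(text: str, rules: dict[str, str]) -> str:
    """Apply learned replacements to whole words or phrases in new text."""
    for old, new in rules.items():
        text = re.sub(r"\b" + re.escape(old) + r"\b", new, text)
    return text


# Hypothetical log of line edits captured from editors.
edit_log = [
    ("We utilize a hammer", "We use a hammer"),
    ("They utilize the data", "They use the data"),
    ("It was really good", "It was excellent"),
]

rules = learn_rules(edit_log)
print(rules)
print(apply_rules("We utilize rules", rules))
```

Here only the repeated edit is promoted to a rule, while the one-off edit stays below the threshold. In the workflow described above, those low-count edits are the ones a human would be asked to verify.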

It's unlikely the system could turn good writing into great writing, but at the very least, it could learn enough Strunk-like style suggestions to improve poor writing, via rule based templates, for example to 'use the active voice', 'omit needless words' and to 'put statements in positive form'.

Slush Filter beta version released!

February 3, 2014 Chris Sim

Kadaxis is pleased to announce the beta release of Slush Filter, a tool for literary agents and publishers. Slush Filter accepts fiction manuscripts of 40,000 words or more, in doc, docx, txt and ePub formats, and provides a machine generated report in seconds. Each report contains:

• A recommendation on whether to review the manuscript (based on potential marketability)
• BISAC codes
• Comp titles
• Locations and character names

Please email info@kadaxis.com for an unlimited trial license.

50 Shades of Grey meets Project Gutenberg

August 30, 2013 Chris Sim

If you were a fan of 50 Shades of Grey, you might be interested in these somewhat older, classic tales that our algorithms classified as Fiction / Erotica, purely by analyzing the content. All titles from Project Gutenberg are available as a free download to your eReader or to browse online. Check them out:

Erotic books from Project Gutenberg

Or... Browse all titles we've classified