Open the catalogue copy for any novel and you will read a description in a particular register. Setting, premise, comp titles, BISAC code, age range, awards, sometimes a one-line endorsement from another author. The description is competent. It is also written in the voice of the catalogue itself, which is the voice of someone selling books to other professionals: agents to editors, editors to sales reps, sales reps to retailer buyers. That voice is a B2B voice, and it does its job well. It just does its job for a specific reader, and that reader is not the consumer.

Figure: Catalogue language sells to the trade. Review language reveals how readers search.

Open the same novel's reviews on Amazon, Goodreads, or LibraryThing, and you will read about the same book in a completely different register. Readers do not describe books the way catalogues do. They describe how the book made them feel, what surprised them, which character undid them, how it compared to the last three books they read in the same vein, what they wanted more of and what they could have done without. This is not a more emotional version of the catalogue voice. It is a different vocabulary entirely.

The gap between these two registers is the gap between how a publisher describes a book and how a reader looks for one.

When a reader sits down to find their next book, they search in the second register. They do not type "literary fiction, women's, family saga." They type "messy mother daughter relationship that made me ugly cry" or "smart slow burn historical romance with a competent female lead" or "book like Beach Read but darker." The vocabulary of search is the vocabulary of experience and intent, not the vocabulary of cataloguing.

The retailer search engine, however, indexes whatever vocabulary the publisher supplies. If the publisher supplies catalogue language and the reader searches in experience language, the book is invisible at the moment of decision. The book is present in the catalogue. The book is missing from the search.
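The mismatch can be shown with a toy measurement. The sketch below is not how any retailer's search engine works; it assumes the simplest possible matching, plain word overlap, and asks what fraction of a reader's query words appear in the supplied copy. The function names and the reader-style copy line are hypothetical and exist only for illustration. The catalogue string and the query are the examples used above.

```python
import re


def tokens(text):
    """Lowercase a string and return its set of distinct word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def query_coverage(query, metadata):
    """Fraction of distinct query words that also appear in the metadata.

    Assumption: plain word overlap stands in for search matching.
    Real retailer search is far more involved than this.
    """
    query_words = tokens(query)
    if not query_words:
        return 0.0
    return len(query_words & tokens(metadata)) / len(query_words)


# Catalogue-register copy and the reader query, both taken from the text above.
catalogue_copy = "Literary fiction, women's, family saga."
query = "messy mother daughter relationship that made me ugly cry"

# Hypothetical copy written in the reader's register, for comparison only.
reader_style_copy = "A messy mother daughter relationship that will make you ugly cry."

print(query_coverage(query, catalogue_copy))
print(query_coverage(query, reader_style_copy))
```

Under this crude measure the catalogue copy shares no words at all with the query, while the reader-register copy covers most of them. The exact numbers matter less than the direction: an index can only match on words it was given.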

This is the case for using reader reviews as a primary source for keyword work. Reviews are a record, in volume, of how the actual buying audience for a book describes that book in its own voice. They are not a perfect record. They include outliers, off-topic complaints, plot summaries by readers who got the wrong book, and reviewers performing for an audience. But across a few hundred or a few thousand reviews of a successful comparable title, a stable signal emerges: the words and phrases readers reliably use to describe what the experience of that book is like.

That signal is the closest available proxy for buying intent.

It is also a source of metadata that a publisher cannot generate from inside their own offices. A publisher describes a book using the language of the people who made it: editors, marketers, publicists. Those people are skilled describers, but they are not the audience. Their language is closer to the book and further from the search. The further upstream from the reader the description-writer sits, the larger the gap between what they write and what a reader is going to search for.

One objection we sometimes hear is that reviews are a noisy source. They are. So is every other large unstructured corpus that produces useful information. The work of using reviews well is the work of separating signal from noise: identifying which words and phrases are reliably tied to the experience of a specific book or its category, and which are random reviewer flourishes. Done at scale, this is solvable. Done one book at a time by hand, it is impractical, which is why doing it well requires a systematic, repeatable methodology rather than a manual scan.
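One simple way to picture that separation, offered as a minimal sketch rather than a description of any production methodology: count how many reviews of a title contain each short phrase, then keep only the phrases that are common for that title and rare in reviews of unrelated books. The function names, the thresholds (`min_share`, `min_lift`), and the sample reviews are all hypothetical.

```python
import re
from collections import Counter


def ngrams(text, n_max=3):
    """Return the set of 1- to n_max-word phrases in one review."""
    words = re.findall(r"[a-z']+", text.lower())
    grams = set()
    for n in range(1, n_max + 1):
        for i in range(len(words) - n + 1):
            grams.add(" ".join(words[i:i + n]))
    return grams


def document_frequency(reviews, n_max=3):
    """Count how many reviews each phrase appears in (once per review)."""
    counts = Counter()
    for review in reviews:
        counts.update(ngrams(review, n_max))
    return counts


def signal_phrases(target_reviews, background_reviews, min_share=0.5, min_lift=2.0):
    """Keep phrases that are common for the title and rare in the background."""
    target_df = document_frequency(target_reviews)
    background_df = document_frequency(background_reviews)
    n_target, n_background = len(target_reviews), len(background_reviews)
    kept = []
    for phrase, count in target_df.items():
        share = count / n_target
        # Add-one smoothing so phrases absent from the background
        # do not cause a division by zero.
        background_share = (background_df[phrase] + 1) / (n_background + 1)
        lift = share / background_share
        if share >= min_share and lift >= min_lift:
            kept.append((phrase, round(share, 2), round(lift, 2)))
    # Strongest lift first, ties broken alphabetically.
    return sorted(kept, key=lambda row: (-row[2], row[0]))


# Hypothetical reviews of one title, including an off-topic complaint.
target = [
    "Slow burn romance that made me ugly cry",
    "I loved this book, such a slow burn and I did ugly cry",
    "Great book, arrived late though",
    "The slow burn was worth it",
]
# Hypothetical reviews of unrelated books, used as the noise baseline.
background = [
    "Great book, fast shipping",
    "I loved this book",
    "A fun thriller, great book",
    "Not my favourite book",
]

for phrase, share, lift in signal_phrases(target, background):
    print(phrase, share, lift)
```

On this toy data, experience phrases such as "slow burn" and "ugly cry" survive the filter, while generic praise such as "great book" and the delivery complaint do not. Real review corpora would need more care (spam, duplicate reviews, stop words, spelling variants), which is part of why the work does not reduce to a manual scan.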

The practical implication for a publisher is concrete. The keyword strings, BISAC selections, and description copy you supply are competing with reader-supplied language for the attention of the retailer's search index. Whichever vocabulary is closer to the queries readers are actually running wins. If your metadata vocabulary is upstream of the reader, which is the default for most catalogue work, you are losing that competition before it starts.

There is no fix for this on the catalogue side. Catalogues will continue to describe books in the language of the trade, because the trade is who reads them. The fix is on the metadata side: the keyword fields, search visibility work, and description copy that face the retailer rather than the trade. That work has to be written from the reader's side of the search, not the publisher's. Reviews are the closest available record of what the reader's side looks like.