How Copyediting Could Be Disrupted

A human copyeditor is unlikely to be completely displaced by a machine, but a significant portion of the common copyedits made to manuscripts could be automated. A primitive tool, AutoCrit, already assists with copyediting by suggesting changes to text based on readability and other metrics. A more advanced tool could capture micro-edits across multiple manuscripts, compare them, and then automatically apply a change wherever confidence in it is high.

Publishing houses are best placed to create these specialized copyediting knowledge bases. They could start by installing software on editors' machines to capture each line edit and log it to a central database. A copyediting rules engine would then compare the text before and after each change, using part-of-speech (POS) tagging to disambiguate word categories. After collecting enough examples of similar edits, the system could learn a rule and apply it to similar occurrences in new text. These rules would be saved as POS-aware templates. A rules-based library already exists that could easily be adapted to support this system.
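The capture-and-learn loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names extract_edits and learn_rules, the min_count threshold and the sample sentences are all hypothetical, and POS tagging is left out entirely, so the "rules" here are plain word substitutions rather than the POS-aware templates a real system would need.

```python
from collections import Counter
from difflib import SequenceMatcher


def extract_edits(before, after):
    """Return (old_phrase, new_phrase) pairs for each changed run of words."""
    old, new = before.split(), after.split()
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            edits.append((" ".join(old[i1:i2]), " ".join(new[j1:j2])))
    return edits


def learn_rules(logged_edits, min_count=3):
    """Promote a substitution to a rule once it has been seen min_count times.

    logged_edits is an iterable of (before, after) sentence pairs, standing in
    for the central database of captured line edits (an assumption).
    """
    counts = Counter()
    for before, after in logged_edits:
        counts.update(extract_edits(before, after))
    # Keep pure substitutions only; insertions and deletions need context
    # (for example POS tags) that this sketch does not model.
    return {
        old: new
        for (old, new), n in counts.items()
        if n >= min_count and old and new
    }


if __name__ == "__main__":
    log = [
        ("We utilize a parser", "We use a parser"),
        ("They utilize tools", "They use tools"),
        ("I utilize it daily", "I use it daily"),
        ("A large amount of bugs", "A large number of bugs"),
    ]
    print(learn_rules(log))
```

In this toy log only the substitution seen three times is promoted; the single "amount" to "number" edit stays below the threshold and would wait for more evidence.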

The new copyediting system would undoubtedly suggest some suboptimal changes, or offer several alternatives for the same text. In those cases, a human would verify the change. The system would learn which changes were preferable, and under which circumstances, until it had enough knowledge and confidence to apply edits automatically. The review process could be extended to include feedback from book reviewers, who would rate the most effective changes.
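One way to picture the human-in-the-loop stage is a per-rule tally of accepted and rejected suggestions. The sketch below is an assumption, not a description of any existing tool: the class name RuleStats, the smoothed acceptance rate used as "confidence", and the threshold and min_reviews defaults are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class RuleStats:
    """Tracks how often human reviewers accepted one rule's suggestions."""

    accepted: int = 0
    rejected: int = 0

    def record(self, was_accepted: bool) -> None:
        if was_accepted:
            self.accepted += 1
        else:
            self.rejected += 1

    @property
    def confidence(self) -> float:
        # Laplace-smoothed acceptance rate, so a rule with no reviews
        # starts at 0.5 rather than dividing by zero.
        return (self.accepted + 1) / (self.accepted + self.rejected + 2)

    def auto_apply(self, threshold: float = 0.9, min_reviews: int = 20) -> bool:
        """Apply without review only after enough consistently accepted edits."""
        reviews = self.accepted + self.rejected
        return reviews >= min_reviews and self.confidence >= threshold


if __name__ == "__main__":
    stats = RuleStats()
    for _ in range(30):
        stats.record(True)
    print(stats.confidence, stats.auto_apply())
```

Requiring both a minimum number of reviews and a high acceptance rate keeps a rule in the "suggest and verify" state until editors have agreed with it repeatedly.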

It's unlikely the system could turn good writing into great writing, but at the very least it could learn enough Strunk-like style suggestions to improve poor writing via rule-based templates: for example, to 'use the active voice', 'omit needless words' and 'put statements in positive form'.