Practical Natural Language Processing

Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta, Harshit Surana

Practical Natural Language Processing is your guide to building, iterating on, and scaling NLP systems in a business setting, and to tailoring them for various industry verticals.

Consider the task of building a chatbot or text classification system at your organization. In the beginning, there may be little or no data to work with. At this point, a basic solution using rule-based systems or traditional machine learning is a good fit. As you accumulate more data, you can move to more sophisticated ML techniques, including deep learning, which are often data-intensive. At each step of this journey there are dozens of alternative approaches you can take. This book will help you navigate this maze of options.

You will also learn to adapt your solutions for different industry verticals such as law, finance, and retail, along with the specific caveats you will encounter in each. You will learn how to analyze and summarize legal documents, provide automated customer support, extract attributes from e-commerce products, and much more.

The authors hail from Carnegie Mellon, UC San Diego (current NLP Ph.D. student), the University of Tübingen, and the Indian Institutes of Technology. They have built and deployed NLP and ML systems in both academia and industry, including at Fortune 100 companies, Silicon Valley startups, the MIT Media Lab, Microsoft Research, and Google AI. They have also taught NLP courses at US universities at the Assistant Professor level and published dozens of research papers in the field, with hundreds of citations.

The book distills the authors’ collective wisdom on building and iterating on NLP systems. Researchers and scientists from some of the world's top universities and technology companies are also advising on the book's content.

Table of Contents

Why we wrote this Book

There is a range of widely popular books on NLP on the market. Some serve as university textbooks and focus on theoretical aspects, while others aim to introduce NLP concepts through many code examples. A few others focus on specific NLP or machine learning libraries and provide “how-to” guides for solving different NLP problems with those libraries. So, why another book on NLP?

We have been building and scaling NLP solutions for over a decade at leading universities and technology companies. While mentoring colleagues and other engineers, we noticed a gap between NLP practice in industry and the NLP skill set of engineers who are new to the profession, or who are experienced but just starting out with NLP. We came to understand these gaps even better through the NLP workshops we conducted for industry professionals.

Most online courses and books tackle NLP problems using toy use cases and popular datasets that are often large, clean, and well-defined. While this teaches readers general methods, we believe it does not provide enough of a foundation to tackle new problems and develop specific solutions in the real world. To our knowledge, existing materials do not deal with the problems commonly encountered when building real-world applications: data collection, working with noisy data and signals, incremental development of solutions, and the issues involved in deploying a solution as part of a larger application. We felt a book was needed to bridge this gap, and that is how this book was born!

Book Structure

More about the Book

The book aims to give the reader a quick overview, followed by in-depth knowledge and theoretical background.

To present the material clearly, we have structured each chapter into sections. In ‘Essentials’ (Section 2), each chapter begins with background, history, and applications. This is followed by a basic algorithm as part of a first code walkthrough. We then delve into the theoretical foundations behind it and move on to more sophisticated algorithms and models. We wrap up with a glimpse of cutting-edge techniques and results. For instance, in the chapter on Text Classification, we begin with Naive Bayes as the first baseline and then improve the solution with algorithms like SVM and FastText. The chapter closes this 360-degree view with practical tips and a glimpse of state-of-the-art methods like word-level and character-level CNNs and RNNs.

Whereas Section 2 goes deep into each topic, ‘Applied’ (Section 3) traverses a range of topics horizontally to build a comprehensive understanding of how to apply the knowledge gained in earlier sections. For instance, in the chapter on E-commerce and Retail, we cover a range of problems including attribute extraction, aspect-level review analysis, duplicate product detection, faceted search and ranking, aspect identification in product descriptions, and finding substitute and complementary items. We also compare and contrast generic and domain-specific techniques. For example, we highlight the features and implementation of an e-commerce search engine as opposed to a generic search engine.

The topics covered in this book were informed by surveys conducted at various technical workshops and panel discussions that the authors have been a part of. All the algorithms, techniques, datasets, and technologies are supported with extensive references so that the reader can dig into the details. Throughout the book, we cover practical tips and best practices for building and deploying these models. Last but not least, every chapter ends with a ‘cutting edge’ section that discusses the state of the art.

The book will be around 350 pages. It will be accompanied by a code repository containing several Jupyter notebooks for all the chapters, which walk through and explain the code in detail. The code base is written in Python and uses various machine learning and natural language processing libraries. The book assumes that readers have a good grasp of programming but no theoretical or practical knowledge of NLP.

Commonly Asked Questions