Computing Reviews, the leading online review service for computing literature.

Mathematics of data science: a computational approach to clustering and classification
Calvetti D., Somersalo E., SIAM, Philadelphia, PA, 2020. 189 pp. Type: Book (978-1-611976-36-6)

Date Reviewed: Oct 7 2021

Mathematics is the foundation of data science techniques. With the democratization of data science, almost anyone has access to easy-to-use tools and platforms to get started with data science applications. However, a serious professional or researcher will sooner or later need to understand the mathematics behind the wide variety of data science methods. To this end, this book provides a lightweight mathematical background for common machine learning models and techniques.

The book starts with a refresher in linear algebra and a concise overview of basic mathematical terms and operations, such as vectors, matrices, and eigenvalues and eigenvectors. However, the real meat of the book starts with chapter 2, where the authors introduce principal component analysis (PCA). In a later chapter, the authors cover other well-known and commonly used techniques such as k-means, classification algorithms, and tree-based classifiers. For each technique, the authors first provide a brief description to establish the need for it, and then delve into the underlying mathematics.

The authors typically include brief examples to show the transformations and applicable operations; these examples are welcome additions to the book. The provided figures and tables are clean; in general, they help readers better understand the concept being explained. The book assumes a basic understanding of mathematical notation and operations, such as summation, and does a good job of raising the bar on the mathematical knowledge associated with machine learning models. However, the authors fail to include a much-desired piece: a high-level logical explanation of data science operations without using any Greek symbols.

Reviewer: Tushar Sharma	Review #: CR147369

Clustering (H.3.3 ... )

General (G.0)

Other reviews under "Clustering":	Date

Concepts and effectiveness of the cover-coefficient-based clustering methodology for text databases Can F. (ed), Ozkarahan E. ACM Transactions on Database Systems 15(3): 483-517, 1990. Type: Article	Dec 1 1992

A parallel algorithm for record clustering Omiecinski E., Scheuermann P. ACM Transactions on Database Systems 15(3): 599-624, 1990. Type: Article	Nov 1 1992

Organization of clustered files for consecutive retrieval Deogun J., Raghavan V., Tsou T. ACM Transactions on Database Systems 9(4): 646-671, 1984. Type: Article	Jun 1 1985
