Big data is a field that deals with large, complex datasets in structured or unstructured form. These are the top 10 big data books to help you learn the field.
By Alex Gorelik
The author, founder of Waterline Data, shares best practices drawn from practitioners across industries who work with big data lakes. The book explains how to start and grow a data lake from data puddles and data ponds, and how different data lake architectures offer opportunities and obstacles in virtual or cloud-based environments. It also stresses the importance of self-service: letting users find, explore, and understand data on their own. In short, an effective data lake needs the right platform, the right data, sound organization, and a self-service interface built around users' actual needs.
2. Big Data in Practice: How 45 Successful Companies Used Big Data Analytics to Deliver Extraordinary Results
By Bernard Marr
Bernard Marr is a top business writer, keynote speaker, and consultant on big data analytics and enterprise performance. This book suits newcomers and experienced readers alike who want to understand the intricate world of big data through real-life case studies of 45 highly successful international companies and organizations (including Walmart, CERN, Netflix, Rolls-Royce, Shell, Apixio, Lotus F1 Team, Pendleton & Son Butchers, US Olympic Women’s Cycling Team, ZSL, Facebook, John Deere, Royal Bank of Scotland, LinkedIn, Microsoft, Acxiom, US Immigration and Customs, Nest, GE, Etsy, Narrative Science, BBC, Milton Keynes, Palantir, Airbnb, Sprint, Dickey’s Barbecue Pit, Caesars, Fitbit, Ralph Lauren, Zynga, Autodesk, Walt Disney Parks and Resorts, Experian, Transport for London, The US Government, IBM Watson, Google, Terra Seismic, Apple, Twitter, Uber, Electronic Arts, Kaggle, Amazon), along with their strategies and tips.
By EMC Education Services
This book covers the tools and techniques of big data analytics. It is especially good for beginners, and it also helps stakeholders, database experts, and managers of big data teams deepen their analytical abilities, as well as students exploring the field as a career. The book has twelve chapters. The first half introduces data science and big data analytics, the lifecycle of a big data analytics project, statistical methods using open-source R, exploratory data analysis through visualization, association rules, and regression techniques. The second half covers classification, clustering, time series, text analysis, specific technologies and advanced tools, and how to operationalize big data work.
By Bill Chambers, Matei Zaharia
This book is written for data scientists and data engineers who want to use Apache Spark and have a basic background in machine learning. It focuses on the new generation of APIs introduced in Spark 2.0 and presents comprehensive coverage of Apache Spark with easy-to-run examples. It offers deep insight into the high-level structured APIs (including Spark SQL, DataFrames, Datasets, and Structured Streaming). It enables readers to write modern Apache Spark applications, drawing on case studies related to storing, analyzing, and exploiting data (the Snowden affair, data security, etc.).
By Alan Nugent, Fern Halper, Judith S. Hurwitz, Marcia Kaufman
This book explains the importance of big data as it becomes a top trending technology with the potential to bring revolutionary change. It has seven main parts. The first gives a detailed understanding of big data from both a technical and a business viewpoint. The second lays out the technical foundations of big data so that business experts can understand the infrastructure in depth. The third and fourth cover the analytics and management of big data. The remaining fifth, sixth, and seventh parts explain how to solve big data problems in the real world and offer a peek into the future.
By Viktor Mayer-Schönberger, Kenneth Cukier
The focus of this book is how big data has become the dominant scientific paradigm and has caused a paradigm shift. It is divided into ten chapters, the first four of which discuss three major developments in detail: More, Messy, and Correlations.
The following four chapters explain its societal and economic implications, primary and secondary applications, and privacy protection (the dark side of big data). Managing data is shown to be very challenging technically, legally, and in terms of established practice.
The book is somewhat padded and revisits themes it has already covered. The final two chapters attempt to propose a solution to one of big data’s most difficult concerns (privacy protection) while also acknowledging its limitations.
By Martin Kleppmann
The author is a researcher, software engineer, blogger, entrepreneur at internet companies, conference speaker, and open source contributor. In this book, he builds a deep understanding of the technical ideas that help you develop better software. He discusses how to choose the right tools for an application, including relational databases, NoSQL datastores, stream or batch processors, and message brokers. The book gives a detailed overview of the advantages and disadvantages of numerous processing and storage technologies. It helps software engineers and architects put these ideas into practice in modern applications.
By Nathan Marz, James Warren
The book consists of three parts (Batch layer, Serving layer, and Speed layer) spanning eighteen chapters. Chapter one introduces the general idea of the Lambda Architecture. Chapters two to nine cover the batch layer, which teaches how to model a dataset. Chapters ten and eleven explain the serving layer, which covers specialized databases that are updated in bulk. Chapters twelve to eighteen focus on the speed layer, which helps deliver up-to-date results. The book also discusses NoSQL databases, stream processing, and how to cope with the complexities of incremental computation in detail.
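To make the three layers concrete, here is a minimal Python sketch of how they might fit together. It is an illustration under assumptions, not code from the book: the page-view counting task and the names `build_batch_view`, `SpeedLayer`, and `query` are all hypothetical.

```python
from collections import Counter


def build_batch_view(master_dataset):
    """Batch layer (hypothetical): recompute the view from scratch
    over the full, append-only master dataset."""
    return Counter(event["page"] for event in master_dataset)


class SpeedLayer:
    """Speed layer (hypothetical): incrementally update a small
    realtime view for events the last batch run has not seen yet."""

    def __init__(self):
        self.realtime_view = Counter()

    def ingest(self, event):
        self.realtime_view[event["page"]] += 1


def query(page, batch_view, realtime_view):
    """Serving side (hypothetical): answer a query by merging the
    precomputed batch view with the realtime view."""
    return batch_view[page] + realtime_view[page]


# Events already covered by the last batch run.
master_dataset = [{"page": "/home"}, {"page": "/home"}, {"page": "/about"}]
batch_view = build_batch_view(master_dataset)

# One event that arrived after the batch run.
speed = SpeedLayer()
speed.ingest({"page": "/home"})

total_home = query("/home", batch_view, speed.realtime_view)
total_about = query("/about", batch_view, speed.realtime_view)
print(total_home, total_about)
```

In this sketch the batch view is always rebuilt from the raw data, so only the small realtime view has to deal with incremental updates.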
By Thomas Erl, Wajid Khattak, Paul Buhler
This book is a best-seller thanks to its easy-to-understand, comprehensive, and up-to-date treatment of the fundamental concepts, theory, and techniques of big data. It explains data concepts clearly through simple case studies and accompanying diagrams, and links them to business practice. The authors also explain how an organization can move forward by solving previously intractable business problems. The book covers big data initiatives, big data adoption, the five “V” characteristics of datasets, data warehouses, how to leverage big data, the use of NoSQL, and the different formats of datasets (structured, semi-structured, unstructured, and metadata).
By Dawn E. Holmes
The author specializes in machine learning and data mining. According to Dawn E. Holmes, with big data a change in quantity brings a change in quality. The book seems aimed at beginners, as it introduces new technologies through case studies showing how data can be stored, analyzed, and misused by big companies. It discusses how big data transforms business operations and raises ethical issues around privacy protection. It also highlights machine learning, hacking, and security attacks as they relate to big data. The book is somewhat disorganized in its use of technical terms and broad concepts.