Object Oriented Data Analysis

Description: Object Oriented Data Analysis is a framework that facilitates interdisciplinary research by providing new terminology for discussing the many possible approaches to the analysis of complex data. Such data arise naturally in a wide variety of areas. This book aims to provide ways of thinking that enable sensible choices. The main points are illustrated with many real data examples, based on the authors’ personal experiences, which h...

Beginning Apache Spark 3, 2nd Edition

Description: Take a journey toward discovering, learning, and using Apache Spark 3.0. In this book, you will gain expertise on the powerful and efficient distributed data processing engine inside of Apache Spark; its user-friendly, comprehensive, and flexible programming model for processing data in batch and streaming; and the scalable machine learning algorithms and practical utilities to build machine learning applications. Beginning Apache Spark 3 begins by expl...

Communicating with Data: Making Your Case With Data

Description: Data is a fantastic raw resource for powering change in an organization, but all too often the people working in those organizations don’t have the necessary skills to communicate with data effectively. With this practical book, subject matter experts will learn ways to develop strong, persuasive points when presenting data to different groups in their organizations. Author Carl Allchin shows anyone how to find data sources and develop data analytics, ...

A History of Data Visualization and Graphic Communication

Description: A comprehensive history of data visualization: its origins, rise, and effects on the ways we think about and solve problems. With complex information everywhere, graphics have become indispensable to our daily lives. Navigation apps show real-time, interactive traffic data. A color-coded map of exit polls details election balloting down to the county level. Charts communicate stock market trends, government spending, and the dangers of epidemics. A His...

The Data Warehouse Toolkit, 3rd Edition

Description: The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling. An updated new edition of Ralph Kimball’s groundbreaking book on dimensional modeling for data warehousing and business intelligence! The first edition of Ralph Kimball’s The Data Warehouse Toolkit introduced the industry to dimensional modeling, and now his books are considered the most authoritative guides in this space. This new third edition is a complete library of updated ...


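Description: The rise of big data and machine intelligence will have a major impact on technology, business, and society. In 《智能时代:大数据与智能革命重新定义未来》 (The Age of Intelligence: Big Data and the Intelligence Revolution Redefine the Future), author Wu Jun argues that, first, problems once considered very hard to solve will be readily solved through the use of big data and machine intelligence, such as the challenge of personalized cancer treatment. At the same time, big data and machine intelligence will fundamentally change future business models: many traditional industries will adopt intelligent technology to upgrade themselves and, in doing so, change their existing business models. The impact of big data and machine intelligence on future society will be comprehensive. About the author: Wu Jun, PhD, in 2002...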

Data Pipelines Pocket Reference

Book Description: Data pipelines are the foundation for success in data analytics and machine learning. Moving data from many diverse sources and processing it to provide context is the difference between having data and actually gaining value from it. This pocket reference defines data pipelines and explains how they work in today's modern data stack. You'll learn common considerations and key decision points when implementing pipelines, such as data pipeline...

Data Lake Analytics on Microsoft Azure

Book Description: Get a 360-degree view of how the journey of data analytics solutions has evolved from monolithic data stores and enterprise data warehouses to data lakes and modern data warehouses. This book includes comprehensive coverage of: how to architect data lake analytics solutions by choosing suitable technologies available on Microsoft Azure; the advent of microservices applications covering ecommerce or modern solutions built on Io...

Advanced Analytics in Power BI with R and Python

Book Description: This easy-to-follow guide provides R and Python recipes to help you learn and apply the top languages in the field of data analytics to your work in Microsoft Power BI. Data analytics expert and author Ryan Wade shows you how to use R and Python to perform tasks that are extremely hard, if not impossible, to do using native Power BI tools. For example, you will learn to score Power BI data using custom data science models and powerful models from Microso...

Creating Good Data

Book Description: Create good data from the start, rather than fixing it after it is collected. By following the guidelines in this book, you will be able to conduct more effective analyses and produce timely presentations of research data. Data analysts are often presented with datasets for exploration and study that are poorly designed, leading to difficulties in interpretation and to delays in producing meaningful results. Much data analytics training focuses on how t...

Python for Data Analysis, 2nd Edition (Chinese edition: 利用Python进行数据分析 原书第2版)

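Description: Written by Wes McKinney, creator of the Python pandas project, this book covers in detail the specifics and essentials of manipulating, processing, cleaning, and wrangling data with Python. The second edition has been fully revised and updated for Python 3.6, covers the latest versions of pandas, NumPy, IPython, and Jupyter, and adds numerous practical case studies to help you solve a range of data analysis problems efficiently. Major updates in the second edition include: • All code, including the Python tutorial, updated to Python 3.6 (the first edition used Python 2.7) • Updated installation instructions for the Anaconda Python distribution and other required Python packages • Updated pa...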


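Description: Written in a lively style reminiscent of a serialized novel, this book vividly presents the techniques a good data analyst should know: the basic steps of data analysis, experimental methods, optimization methods, hypothesis testing, Bayesian statistics, subjective probability, heuristics, histograms, regression, error handling, relational databases, and data-cleaning techniques. After the main text, three appendices cover the ten most important points of data analysis, the R tool, and the ToolPak tool; beyond presenting the target material, they give readers a bridge to deeper study. The book's structure is full of twists and its prose is witty and engaging. Whether you are a seasoned professional or a newcomer to the field, whether you read every word closely or simply browse, we believe that...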