This book includes comprehensive coverage of:
How to architect data lake analytics solutions by choosing suitable technologies available on Microsoft Azure
How the advent of microservices applications covering ecommerce or modern solutions built on IoT, along with real-time streaming data, has completely disrupted this ecosystem
How data analytics solutions have been transformed from solely understanding trends in historical data to building predictions by infusing machine learning technologies into the solutions
Data platform professionals who have been working on relational data stores, non-relational data stores, and big data technologies will find the content in this book useful. The book can also help you start your journey into the data engineering world, as it provides an overview of advanced data analytics and touches on data science concepts and the various artificial intelligence and machine learning technologies available on Microsoft Azure.
What Will You Learn
Architecture patterns of the modern data warehouse and advanced data analytics solutions
Phases (such as Data Ingestion, Store, Prep and Train, and Model and Serve) of data analytics solutions and the technology choices available on Azure under each phase
In-depth coverage of real-time and batch mode data analytics solutions architecture
Various managed services available on Azure such as Synapse Analytics, Event Hubs, Stream Analytics, and Cosmos DB, and managed Hadoop services such as Databricks and HDInsight
Covers the life cycle of data, from building pipelines to data analytics and visualizations
Provides use cases for real-time and batch mode processing
Shows you how to infuse machine learning into real-time and batch mode data analytics pipelines
Chapter Goal: This chapter introduces readers to the concept of a data lake and the need for one in a big data environment. It also covers how to create a data lake and the architecture patterns to follow for data lake analytics.
No of pages: 15
Sub - Topics:
1. Relational and non-relational data stores
2. Base for data: relational and non-relational databases
3. Warehouses of data: data warehouses
4. Markets for data: data marts
5. Introduction to data lake
6. Need to create a data lake
Chapter 2: Data Just Got Bigger
Chapter Goal: Today, enterprises have a mix of relational and non-relational stores. However, when it comes to analyzing all this data, there must be a neutral platform that can understand each of these data types. This introduces the modern concepts of distributed data storage and processing. The chapter also covers data science and machine learning concepts and how they are revolutionizing the world of data analysis.
No of pages: 20
Sub - Topics:
1. Massively parallel processing, distributed data, Spark, and Hadoop
2. Distributed systems vs massively parallel processing systems (MPP)
3. Respective use cases for distributed and MPP systems
4. Science for data
5. Learning of machines
6. Overview of data analytics and advanced data analytics
Chapter 3: Emergence of Cloud Lakes
Chapter Goal: This chapter introduces readers to the cloud-based technologies available, which are scalable, agile, and performant in terms of computation, storage, and analytics options. It details the suggested architecture on Microsoft Azure for modern data warehouse and analytics use cases.
No of pages: 20
Sub - Topics:
1. Data travels to Cloud with added benefits
2. Overview of phases of data analytics architecture
3. Available products under each phase on Microsoft Azure
Chapter 4: Phases in Managing Data Analytics Pipeline
Chapter Goal: This chapter covers the core subject of the book in depth. With the background in place, it explains the phases of building an entire data analytics pipeline. All the phases discussed in this book are critical to understand, and any analytics solution will adhere to this common principle in one way or another. Each phase offers different solutions to address its respective issues. The chapter covers the data life cycle from upstream to downstream applications.
No of pages: 20
Sub - Topics:
1. Real time and batch mode data processing
2. Phases in data management
· Ingest
· Store
· Analytics
· Visualization
3. Cloud data lake architecture patterns
Chapter 5: Data Ingestion in the Lake
Chapter Goal: This chapter discusses the limitations of traditional storage and how big data technologies have emerged as the champion in solving those limitations, changing the concept of Extract, Transform & Load (ETL) into Extract, Load & Transform (ELT).
No of pages: 20
Sub - Topics:
1. Traditional limitations, can big data help?
2. ETL now becomes ELT
3. Tools in cloud for data ingestion
· Azure Data Factory on Microsoft Azure
· SQL Server Integration Services on-premises
4. Overview of partner solutions for ETL/ELT - Informatica PowerCenter
Chapter 6: Data Storage & Farming
Chapter Goal: This chapter explains how, once data is available in the storage layers, it can be grown and how real-time data storage and analysis needs can be met. It also covers batch and real-time data processing and storage.
No of pages: 20
Sub - Topics:
1. Grow the data
2. Role of Azure Data Lake Store, Blob storage, and relational and non-relational stores
3. Architecting the Lambda and Kappa architectures
4. Manage storage for real time and batch processing
Chapter 7: Analyzing the Bigger Data in Real Time
Chapter Goal: Analysis of data is crucial for enterprises to get business insights from historic, present, and future data through descriptive, streaming, and predictive analytics. This chapter focuses specifically on real-time analytics: the components required to perform it and how to optimize cost using Azure PaaS solutions.
No of pages: 30
Sub - Topics:
1. Need for real-time analytics
2. Approach to building data analytics on a data lake for real-time processing
3. Leveraging Event Hubs/IoT Hub as a queuing solution on Azure
4. Why edge computing and digital twins are gaining the limelight
5. Choosing between PaaS and IaaS solutions for streaming data processing
6. PaaS - Stream Analytics or Spark Streaming
7. Infusing R and Python into real-time data analytics pipelines
8. Use cases for real time analytics
Chapter 8: Analyzing the Bigger Data in Batch Mode
Chapter Goal: Analysis of data is crucial for enterprises to get business insights from historic, present, and future data through descriptive, streaming, and predictive analytics. Analytics can help companies identify new business opportunities and revenue streams, which results in increased profits, new customers, and improved customer service.
No of pages: 30
Sub - Topics:
1. Role of big data and massively parallel processing systems
2. Approach to build data analytics on data lake for batch processing
3. Approach to build data analytics solution for real time analytics
4. When to leverage HDInsight and Spark clusters
5. Infuse R and Python in data analytics pipelines
6. How it's different from conventional data warehousing and massively parallel processing solutions
7. Use cases for batch mode processing
Chapter 9: Visualization and Other Downstream Choices
Chapter Goal: Visualization of data is crucial for reporting and for performing exploratory data analytics. This chapter covers visual elements such as charts, graphs, and maps, and the data visualization tools that provide an accessible way to see and understand trends, outliers, and patterns in data.
No of pages: 10
Sub - Topics:
1. Visualization tools - Power BI
2. Downstream applications - LOB applications, notification applications
3. Choice of data stores for downstream applications - Cosmos DB, Azure SQL Database
Chapter 10: Summary of Data Lake components in Azure
Chapter Goal: This chapter looks at the multiple Azure components that make it easy to create an enterprise data lake in the cloud and discusses the usage of each in detail.
No of pages: 20
Sub - Topics:
1. Azure Data Factory
2. Azure Data Lake Storage
3. Azure HDInsight
4. Azure Databricks
5. Azure data warehouse
6. Power BI
Chapter 11: Conclusion
Chapter Goal: The concluding chapter summarizes the information on data lakes shared in the book.
No of pages: 5
Publication year: | 2020 |
---|---|
Genre: | Computer science |
Category: | Science & technology |
Medium: | Paperback |
Contents: | xvii, 222 pp., 134 b/w illustrations |
ISBN-13: | 9781484262511 |
ISBN-10: | 1484262514 |
Language: | English |
Features / Supplement: | Paperback |
Binding: | Softcover |
Author: | Khattar, Pankaj; Chawla, Harsh |
Edition: | 1st ed. |
Manufacturer: | Apress; Apress L.P. |
Dimensions: | 254 x 178 x 14 mm |
By/With: | Pankaj Khattar (et al.) |
Publication date: | 9 October 2020 |
Weight: | 0.46 kg |