Product description
This practical book is a complete guide to installing, configuring, and managing Pentaho Kettle. If you’re a database administrator or developer, you’ll first get up to speed on Kettle basics and how to apply Kettle to create ETL solutions—before progressing to specialized concepts such as clustering, extensibility, and data vault models. Learn how to design and build every phase of an ETL solution.
- Shows developers and database administrators how to use the open-source Pentaho Kettle for enterprise-level ETL processes (Extracting, Transforming, and Loading data)
- Assumes no prior knowledge of Kettle or ETL, and brings beginners thoroughly up to speed at their own pace
- Explains how to get Kettle solutions up and running, then follows the 34 ETL subsystems model, as created by the Kimball Group, to explore the entire ETL lifecycle, including all aspects of data warehousing with Kettle
- Goes beyond routine tasks to explore how to extend Kettle and scale Kettle solutions using a distributed “cloud”
Get the most out of Pentaho Kettle and your data warehousing with this detailed guide, which covers everything from simple single-table data migration to complex multisystem clustered data integration tasks.
The ultimate resource on building and deploying data integration solutions with Kettle
Kettle is a scalable and extensible open source ETL and data integration tool that lets you extract data from databases, flat and XML files, web services, ERP systems, and OLAP cubes. It provides over 120 built-in transformation steps to validate, cleanse, and conform data, as well as numerous options to load data into data warehouses and many other targets. Kettle is a comprehensive, low-cost alternative to traditional data integration tools like Informatica PowerCenter, IBM InfoSphere DataStage, and BusinessObjects Data Integrator.
This book explains in detail how to use Kettle to create, test, and deploy your own ETL and data integration solutions. You'll learn to use Kettle's programs to create transformations and jobs, use version control, audit data, and schedule your ETL solution. Then you'll progress to more advanced concepts such as clustering and cloud computing, real-time data integration, loading a Data Vault model, and extending Kettle by building your own plugins. In addition, you'll find hands-on examples and case studies that show exactly how to put Kettle's features into practice.
- Explore the components of the Kettle ETL toolset
- Discover how to install and configure Kettle and connect it to various data sources and targets
- Design and build every aspect of an ETL solution using Kettle
- Learn how to load a data warehouse with Kettle
- Understand the steps for deploying and scheduling ETL solutions
- Gain the skills to integrate Kettle with third-party products
- Learn to extend Kettle and build your own plugins
- Use clustering and cloud computing to scale and improve the performance of your Kettle ETL solutions
- Find out how to use Kettle for real-time data integration