Latest Release of Cloudera Enterprise Brings Better Performance and Operations
Efficiency Across Workloads and Users
Cloudera Enterprise 5.7 Improves Data Processing with Hive-on-Spark Support and
Provides Visibility into Multi-Tenant Usage
Singapore,
April 25, 2016 — Cloudera, the global provider of the fastest, easiest, and most
secure data management and analytics platform built on Apache Hadoop and the
latest open source technologies, today announced the general availability of
Cloudera Enterprise 5.7. This new release provides leading performance across
key workloads – including an average 3x improvement for data processing with
added support of Hive-on-Spark, and an average 2x improvement for business
intelligence analytics with updates to Apache Impala (incubating).
Additionally, this release adds visibility into multi-tenant usage across these
workloads for management efficiency and optimal resourcing. Cloudera Enterprise
5.7 is another leap forward for Hadoop as it grows to support new and changing
use cases, and is indicative of Cloudera’s leadership in ensuring modern
enterprises can fully embrace the platform across the business.
“Hadoop has evolved significantly in the past ten years, and
with every advancement, we see the potential for new applications and use
cases, while improving what’s already being done,” said Charles Zedlewski, vice
president, Products at Cloudera. “The advancement of data engineering and ETL
development with Hive-on-Spark marks a critical milestone in this evolution –
further solidifying Spark’s status as the standard data processing engine in
Hadoop. Data engineering is only a part of the story in today’s business though
and, with the 5.7 release, our customers can better enable a wide range of
users across the platform, all while maintaining fast performance, easy
management, and compliance-ready security.”
ETL development and batch processing remain among the most common use cases
for Hadoop. Apache Hive has long played a key role for these workloads, though
traditionally with MapReduce as the underlying execution engine. However, with
its easy development and faster performance compared to MapReduce, Apache Spark
is playing an increasingly important role and is primed to replace MapReduce for
these workloads. Last year Cloudera launched the One Platform Initiative as the
roadmap to complete the transition from MapReduce to Spark, and it is leading
development to better integrate Spark with Hadoop so that Spark meets the
enterprise requirements of even the largest-scale production workloads. The
release of Hive-on-Spark in Cloudera 5.7 brings that transition one step closer:
developers can now use the data processing capabilities of Spark while
continuing to use familiar Hive, with a 3x performance improvement on average.
Hive-on-Spark is a community-driven initiative launched by Cloudera, IBM, Intel,
MapR, and others. It involved customers across a range of industries, including
advertising, financial services, and insurance, as part of an early access
program for further development.
For further consistency, Cloudera has worked with its ecosystem of more than
2,300 partners to ensure customers can continue to use the leading data
integration and preparation tools with Hive-on-Spark without disrupting the
business. Partners such as BMC, ClearStory Data, Elastic, NGDATA, Solix,
Trillium Software, Zementis, and others are working with Cloudera to certify
their technologies for a seamless transition. (See below for their supporting
statements.)
Supporting multiple use cases across the same shared data within a single
cluster is a key benefit of Hadoop. With Cloudera Enterprise, administrators can
provide users and applications with the right resources to run and meet
critical Service Level Agreements (SLAs). With this release, administrators also
get full visibility into historical usage and efficiency across users, tenants,
and applications. The new Cluster Utilization Reporting feature, built into
Cloudera Manager, ensures efficient operations and proper resource allocation
between groups and workload types; helps guarantee SLAs are being met; and
provides simple troubleshooting of job and query performance issues.
Additional features in Cloudera 5.7 include:
● 2x performance improvements for BI analytics: Impala continues to maintain its
performance lead as the fastest analytic SQL engine for Hadoop through dynamic
partition pruning, faster query startup, runtime filters, and more
● Simplified path to production: Cloudera Manager includes cluster templates
that provide a simple workflow to replicate configuration settings to new
clusters, making it easy to move from a well-tuned test environment to
production, scale out across regions, or quickly revert to a known good
configuration when problems occur
● Optimized data governance: Cloudera Navigator opens up data management and
governance to the business user with simplified lineage for establishing trust
and provenance of data, and adds managed metadata for improved discoverability
and consistency across systems
Cloudera 5.7 is now available on www.cloudera.com/downloads