Stinger Delivers Fast Hive SQL via YARN & Tez

The latest BigSQL bundle now includes Stinger and Tez to dramatically improve performance and extend the SQL dialect. With the latest distribution of BigSQL Release 2, it is extremely easy to get started exploring Tez and Stinger in your own sandbox environment. Taking advantage of YARN, Stinger is now delivering on its three main goals of Speed, SQL Enhancements & Scale.



The Stinger initiative (Hive 0.13 & Tez 0.3) calls for a 100X performance improvement over current Hive (Hive 0.10) batch queries. Key developments helping to achieve these results are:

  • Leveraging ORC Files: This columnar format allows for faster reads, writes and processing of Hive data. The format supports predicate pushdown and compression, and even allows light-weight indexes to be stored within the files.
  • Continued Optimizations in the Hive Query Engine: Continuous enhancements, including avoiding unneeded writes, support for hot containers (which saves launch and shutdown overhead), in-memory caching and leveraging wider Metastore statistics, have all helped improve latency and throughput.
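
To make the idea of predicate pushdown with light-weight indexes more concrete, here is a minimal Python sketch. It is a toy model only, not ORC's actual file layout or reader API: the `Stripe` class and the `build_stripes` and `scan_greater_than` functions are hypothetical names invented for illustration. The sketch assumes each block of column values stores its minimum and maximum, so a filter such as `value > threshold` can skip whole blocks without reading them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stripe:
    """Hypothetical stand-in for one block of column values plus its min/max index."""
    values: List[int]
    lo: int  # smallest value in this block
    hi: int  # largest value in this block


def build_stripes(column: List[int], stripe_size: int) -> List[Stripe]:
    """Split a column into fixed-size blocks and record min/max for each block."""
    stripes = []
    for start in range(0, len(column), stripe_size):
        chunk = column[start:start + stripe_size]
        stripes.append(Stripe(values=chunk, lo=min(chunk), hi=max(chunk)))
    return stripes


def scan_greater_than(stripes: List[Stripe], threshold: int) -> Tuple[List[int], int]:
    """Return matching values and the number of blocks that had to be read."""
    matches: List[int] = []
    stripes_read = 0
    for stripe in stripes:
        # Pushed-down predicate: if the block's maximum cannot satisfy
        # "value > threshold", skip the block without touching its values.
        if stripe.hi <= threshold:
            continue
        stripes_read += 1
        matches.extend(v for v in stripe.values if v > threshold)
    return matches, stripes_read
```

For example, calling `scan_greater_than(build_stripes(list(range(100)), 10), 84)` on this sorted toy column only needs to read the last two of the ten blocks. Real data is rarely this neatly ordered, so the actual benefit depends on how the values are distributed across the file.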



Full support for the DECIMAL, TIMESTAMP, DATE & VARCHAR datatypes, as well as a rich set of semantic improvements.



Stinger is designed for large-scale data environments, and its goal is to deliver even when you are analyzing petabytes of data.


Stinger in Action!

Below is a screenshot from the BigSQL YARN console showing the execution of a Tez job.

At OpenSCG we focus on helping you get the most value and leverage out of your data environments. It's an exciting time in DATA and we thrive on helping our clients understand the latest changes in data technology. We realize that there are great opportunities in new NoSQL and clustered data solutions, and we take pride in helping you evaluate your data options.