The latest BigSQL bundle now includes Stinger and Tez to dramatically improve performance and extend the SQL dialect. With BigSQL Release 2, it is easy to start exploring Tez and Stinger in your own sandbox environment. Taking advantage of YARN, Stinger is now delivering on its three main goals: Speed, SQL Enhancements and Scale.
The Stinger initiative (Hive 0.13 & Tez 0.3) calls for a 100X performance improvement over current Hive (Hive 0.10) batch queries. Key developments helping to achieve these results are:
- Leveraging ORC Files: this columnar format allows for faster reads, writes and processing of Hive data. The format supports predicate pushdown and compression, and even allows light-weight indexes to be stored within the files.
- Continued Optimizations in the Hive Query Engine: ongoing enhancements include avoiding unneeded writes, support for hot containers (which saves launch and shutdown overhead), in-memory caching, and leveraging wider Metastore statistics. Together these have helped improve latency and throughput.
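To give a feel for why light-weight indexes and predicate pushdown speed up reads, here is a toy Python sketch. It is not ORC's actual reader or file layout: the `Stripe` class and `scan_greater_than` function are hypothetical names made up for illustration, and the only idea borrowed from the text above is that each chunk of a column carries small statistics (here, min and max) that let a scan skip data that cannot match a filter.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Stripe:
    """Hypothetical chunk of one integer column plus a min/max index."""
    values: List[int]
    min_value: int = field(init=False)
    max_value: int = field(init=False)

    def __post_init__(self) -> None:
        # The statistics are computed once, when the stripe is "written".
        self.min_value = min(self.values)
        self.max_value = max(self.values)


def scan_greater_than(stripes: List[Stripe], threshold: int) -> Tuple[List[int], int]:
    """Return the values > threshold and how many stripes were actually read."""
    matches: List[int] = []
    stripes_read = 0
    for stripe in stripes:
        # Pushed-down predicate: if even the largest value fails the filter,
        # the whole stripe can be skipped without touching its rows.
        if stripe.max_value <= threshold:
            continue
        stripes_read += 1
        matches.extend(v for v in stripe.values if v > threshold)
    return matches, stripes_read


# Three made-up stripes of an illustrative column.
stripes = [Stripe([1, 5, 9]), Stripe([12, 18, 20]), Stripe([40, 44, 51])]
result, stripes_read = scan_greater_than(stripes, 30)
print(result, stripes_read)  # [40, 44, 51] 1
```

With the filter `value > 30`, only one of the three stripes is read. The same principle, applied to real columnar files, is what lets a query engine avoid a large share of I/O on selective queries.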
Stinger in Action!
Below is a screenshot from the BigSQL YARN console showing the execution of a Tez job.