ApacheCon North America 2014, April 7-9, Denver, CO
Wednesday, April 9 • 11:15am - 12:05pm
Developing the Tez Execution Engine for Pig


Apache Pig is a programming language and execution runtime for petabyte-scale processing with MapReduce. One of the major recent developments in the Hadoop ecosystem is the introduction of Apache Tez, a successor to MapReduce that provides major performance enhancements and a more natural foundation for Pig. The Pig-on-Tez project aims to dramatically increase the throughput of data pipelines written in Pig by using Apache Tez as the execution engine instead of MapReduce. In benchmarks, representative queries run 2-3x faster atop the Tez framework than on MapReduce.

In the second half of this presentation we’ll explain how LinkedIn, Netflix, Hortonworks, and Yahoo have successfully collaborated over a six-month period to deliver a major rewrite of critical infrastructure, providing significant benefits both for themselves and for the community at large.

Speakers

Cheolsoo Park

Senior Software Engineer, Netflix
Cheolsoo Park is an Apache Pig PMC member and committer. He is also a senior software engineer at Netflix and works on cloud-based big data analytics infrastructure that leverages open source technologies including Hadoop, Hive, and Pig. Cheolsoo holds a Bachelor’s degree in Computer Science from the University of Waterloo and is fascinated by large-scale data processing, distributed systems, and cloud computing.

Mark Wagner

LinkedIn
Mark Wagner is a committer on the Apache Pig project and a contributor to many other projects in the Hadoop ecosystem. He is passionate about distributed systems, programming languages, and machine learning. Mark holds Bachelor’s degrees in Mathematics and Computer Science from the University of California, Santa Cruz and is a member of LinkedIn’s distributed analytics infrastructure team.


Wednesday April 9, 2014 11:15am - 12:05pm
Confluence C
