[video:http://www.youtube.com/watch?v=JQcSTWY8FNE]
[video:http://www.youtube.com/watch?v=mzq-zLiOJWA]
Big Data and Data Science projects looking to reduce the risks, costs, and nightmares associated with managing dozens of data feeds have discovered the ETL (Extract, Transform, Load) product category. But there's no such thing as a silver bullet, and while there are practices and lessons to be learned from ETL, the tools are mostly the legacy of early-1990s thinking, when data feeds were fewer, the alternatives were COBOL or C, and writing code was deemed risky by DBAs and management. Ken will show how a high-level language like Python, when matched with certain practices and design patterns, can offer a very successful alternative to these diagram-driven development tools. The discussion will focus on concepts, designs, and patterns, and will include examples of successes and failures with a small amount of code.
Bio
Ken Farmer is a data architect at IBM, where he has built and led the company's Security & Compliance Data Warehouses. These projects used Python extensively for systems management, general data management, ETL, and analytics. He writes about data management at www.ken-far.com and builds data analysis tools like DataGristle for fun on the side.