Feature Engineering in SQL and Python: A Hybrid Approach
https://towardsdatascience.com/feature-engineering-in-sql-and-python-a-hybrid-approach-b52347cd2de4
I knew SQL long before learning about Pandas, and I was intrigued by the way Pandas faithfully emulates SQL. Stereotypically, SQL is for analysts, who crunch data into informative reports, whereas Python is for data scientists, who use data to build models. Although the two tools are almost functionally equivalent, I’d argue both are essential for a data scientist to work efficiently. From my experience with Pandas, I’ve noticed the following:
Those problems were naturally solved once I began feature engineering directly in SQL. So in this post, I’ll share some of my favorite tricks by working through a take-home challenge dataset. If you know a little bit of SQL, it’s time to put it to good use.