Data Engineering & Ice Cream, Together At Last
Hello! As always, this newsletter is a rundown of content you may have missed: articles, networking/community events, tips and tricks. This month we'll discuss Boston's wealth gap, Women in Consulting, Chart Champ, what a faulty commercial freezer can teach us about data engineering (seriously!) and more.
Unpacking Boston's Wealth Gap: Part II
The next section of the dashboard builds on the first article, which visualized redlining in Boston's Suffolk County, to examine the effects of those policies on later populations. Section 2 addresses minority population growth and the movement of racial groups over generations. I'll illustrate the process of extracting the data necessary for the analysis behind the visualization.
Let me tell you, this was probably the most difficult part of the dashboard from a conceptual view. I wanted to show the movement of minorities over several years, but I originally spent too much time thinking about the chart type. A Sankey chart, maybe a sunburst; it had to be captivating or attention-grabbing. I was too enamored with the front end and had not yet grasped the underlying analysis, until I stumbled upon the concept of "Majority Minority", a term describing a place where one or more racial or ethnic minority groups together make up a majority of the population. With this in mind, the analytic journey brought me to focus not just on overall population growth but on its composition. Further, how could the redlining covered in the first section play a role here? Let's discuss! ➡️
Upgrading to Professional Pipelines: The Case for dbt Cloud
dbt (Data Build Tool) has quickly become a key piece in the dizzying puzzle of applications that make up the modern data stack. Serving as the answer to the T in ELT (Extract, Load, Transform), dbt is used by a myriad of data teams to help apply software development practices to transformations in their data pipelines.
dbt comes in two flavors. There's dbt Core, a free, open-source command-line tool, and dbt Cloud, the subscription-based cloud offering. When evaluating tools, that raises an obvious question: which one should you use?
In this article, we will explore why a data team may want to consider dbt Cloud and what differentiates it from its open-source counterpart. We will examine four main factors: orchestration, git integration, documentation, and configuration.
Is it time to make the move to dbt Cloud? Read more ➡️
Meltdown: What Small Business Issues Can Teach Us About Data Engineering
Like many people these days, on top of being a Data Engineer, I have a side hustle. More specifically, about six months ago, I purchased an ice cream shop. Small business ownership has its ups and downs, but few are more interesting than what happened this week.
On a normal day, I don't have to deal with much for this business; I have a great management team that handles most of what is thrown at them. Unfortunately, this was not one of those days. I received a call early in the morning from my general manager informing me of a massive freezer malfunction. It had caused all of the ice cream in our walk-in freezer (approximately 320 gallons) to melt.
After a few moments of panic, our team got to work. We started by breaking down the tasks that needed to be done, which fall into three categories: assessment, repair, and remediation.
Upcoming Events
dbt Boston Meetup
Join the data engineering community on September 27 and hear two more stories of end-user experiences with dbt. We will predominantly focus on people's experience with dbt, but will also discuss broader topics related to data teams, such as data stacks, data ops, modeling, testing, and team structures.
Come hang out with some awesome data practitioners, have a bite to eat, and hear great presentations.
Capturing History in dbt
Daniel Allen, Lead Data Engineer, Rapid7
Our story of dynamically capturing history on Snowflake tables through our dbt models.
Summary: Capturing history is a common and valuable practice within data engineering. It is useful for understanding the lifecycle of your data, auditing changes, and viewing point-in-time snapshots of datasets. Using Snowflake and dbt, I will talk through how we implemented dynamic history capture that can be easily applied to any model within dbt. I'll discuss the design and the technical implementation of our history capture process along with the benefits and limitations of our approach.
Talk Title TBA
Jason Hall, Senior Solutions Architect, Upsolver
A big thank you to our friends at Rapid7 for hosting this event!
Chart Champ
The 8th annual Chart Champ brings together 500+ data and analytics professionals at Boston University's Metcalf Trustee Center. Enjoy food and drinks, the annual dashboard competition, and new this year: Data Strategy Sessions. Hear how Boston Scientific, Barrett Distribution Centers, and Teradyne are using the power of data to transform their organizations. Reserve your space today ➡️
Learn more and register today at cleartelligence.com/chart-champ-2023-event
Women in Consulting Panel Discussion
Hear the stories of some amazing consultants, engage in inclusive, dynamic conversation, and meet like-minded professionals in a casual atmosphere. Register today to reserve your spot ➡️
Moderated by Katy Sandlin McMahon, this event is designed to bring together women from all facets of the consulting world to discuss challenges, success stories, and unique perspectives on this wide-ranging and fast-paced industry.
Learn more and register today at cleartelligence.com/women-in-consulting
Resources
❄️Snowflake Operational Cost Calculator
When it comes to cloud migration, cost is a primary factor for even the largest firms. You need a good idea of what your annual storage and compute costs are going to be to make the most informed decision possible. And there are so many variables! What if data science needs more compute? What if load frequency needs to be adjusted?
We've taken the stress out of all of that with the Snowflake Operational Cost Calculator. Plug in your unique inputs and you will quickly see the forecasted annual costs for storage and compute. The calculator gives a directional estimate intended to facilitate cost forecasting for individual warehouse deployments.
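For a feel of the arithmetic behind this kind of forecast, here is a minimal Python sketch. It is not the calculator's actual formula, just a back-of-the-envelope version under stated assumptions. The credits-per-hour figures follow Snowflake's standard warehouse sizing, where each size up doubles the rate. The price per credit, the storage price per TB, and the usage figures are hypothetical placeholders; real prices depend on your edition, region, and contract. The function names are made up for illustration.

```python
# Credits consumed per hour by standard Snowflake warehouse size
# (each size up doubles the rate).
CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8, "XL": 16}


def annual_compute_cost(size, hours_per_day, days_per_year, price_per_credit):
    """Yearly compute cost for one warehouse that runs a fixed number of hours per day."""
    return CREDITS_PER_HOUR[size] * hours_per_day * days_per_year * price_per_credit


def annual_storage_cost(avg_tb_stored, price_per_tb_month):
    """Yearly storage cost from average TB stored and a monthly per-TB price."""
    return avg_tb_stored * price_per_tb_month * 12


def annual_estimate(size, hours_per_day, days_per_year, price_per_credit,
                    avg_tb_stored, price_per_tb_month):
    """Directional yearly total: compute plus storage."""
    compute = annual_compute_cost(size, hours_per_day, days_per_year, price_per_credit)
    storage = annual_storage_cost(avg_tb_stored, price_per_tb_month)
    return compute + storage


# Hypothetical inputs: a Medium warehouse running 6 hours a day, 260 days a year,
# at an assumed $3.00 per credit, with 5 TB stored at an assumed $23 per TB per month.
baseline = annual_estimate("M", 6, 260, 3.00, 5, 23.00)

# "What if data science needs more compute?" Same inputs, one warehouse size up.
upsized = annual_estimate("L", 6, 260, 3.00, 5, 23.00)

print(f"Baseline: ${baseline:,.2f}")
print(f"Upsized:  ${upsized:,.2f}")
```

With these placeholder inputs, moving from Medium to Large doubles the compute portion while storage stays flat, which is why warehouse size and run time tend to dominate a directional estimate like this one.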
📈Tableau Accelerators
Growth, sustainability, workforce, and inflation are among the top concerns of today's executive leaders. Monitor your most crucial KPIs quickly and easily with our dynamically configurable dashboard solution.
According to the National Retail Federation, retailers lose $24 billion annually to return fraud. We've created an easy way to not only identify but also predict return activity in your establishments.