Data Platform News (November 2024)

Oh, boy, what a month it was! Several major events, bold statements and tons of announcements from all three vendors - Snowflake, Microsoft and Databricks. November was a tough month to track all the things going on! Anyway, I'll do my best to wrap it up and share the most significant updates for all three platforms. So, here we go...

Snowflake

  • Snowflake had their Build 2024 event dedicated to developers in November. Here are the major announcements from this conference: 1) Snowflake Intelligence (in private preview soon) - a new platform backed by Snowflake Cortex AI and Snowflake Horizon Catalog that will enable organizations to easily ask business questions across their data and create data agents that take action on insights, 2) Snowflake Open Catalog (generally available) - allows users to easily adapt as the needs of their organization evolve by integrating new engines and applying consistent governance controls across them, 3) Leaked Password Protection (generally available soon) - credential theft prevention and detection by automatically disabling users’ passwords discovered on the dark web, 4) Observability for ML Models (public preview) - users can quickly detect model degradation in production with built-in monitoring, 5) Cortex COMPLETE Multimodal Input Support (private preview soon) - enhance conversational apps with multimodal inputs like images, 6) Snowflake Connector for Microsoft SharePoint (public preview) - tap into Microsoft 365 SharePoint files and documents and automatically ingest files without having to manually preprocess documents, 7) SPLIT_TEXT_RECURSIVE_CHARACTER function (private preview) for text chunking, 8) AI Observability for LLM Apps (private preview) - with technology integrated from TruEra (acquired by Snowflake), 9) several improvements to Cortex Analyst (public preview), including simplified data analysis with advanced joins (public preview), increased user friendliness with multi-turn conversations (public preview), and more dynamic retrieval with a Cortex Search integration (public preview) - read the blog post by Sri Chintala to learn more, 10) Cortex Playground (now in public preview) - an integrated chat interface designed to generate and compare responses from different LLMs so users can easily find the best model for their needs, 11) Internal Marketplace (generally available) - users can discover available data, apps, and AI products from other teams and business units within their organizations, 12) sharing of fine-tuned large language models (LLMs) (public preview) - making it easier for users to collaborate on generative AI use cases with increased model accuracy and performance for use case-specific tasks, 13) Snowflake Native App Framework Integration with Snowpark Container Services (generally available on AWS and public preview on Microsoft Azure) - allows users to build apps in their preferred programming language with fully customizable user experiences and then deploy them on top of configurable GPU and CPU instances.
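For readers wondering what "recursive character" text chunking (item 7 above) means in practice, here is a minimal Python sketch of the general technique. It is not Snowflake's implementation: the function name `split_text_recursive`, its parameters and the default separator list are my own assumptions for illustration. The idea is to split on the coarsest separator first (paragraphs) and fall back to finer ones (lines, words, single characters) only for pieces that are still too long.

```python
from typing import List, Optional


def split_text_recursive(
    text: str,
    chunk_size: int,
    separators: Optional[List[str]] = None,
) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Hypothetical helper illustrating recursive character splitting;
    the default separators below are an assumption, not a standard.
    """
    if separators is None:
        separators = ["\n\n", "\n", " ", ""]
    if len(text) <= chunk_size:
        return [text] if text else []
    # No usable separator left: fall back to hard character slicing.
    if not separators or separators[0] == "":
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        # Try to merge the piece into the chunk being built.
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        # The piece does not fit: flush the chunk built so far.
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # Still too long: recurse with the finer separators.
            chunks.extend(split_text_recursive(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_text_recursive("aaa bbb ccc\n\nddd eee", chunk_size=7):
        print(repr(chunk))
```

The point of the recursion is that chunks tend to respect natural boundaries (paragraphs, then lines, then words), which usually matters for retrieval quality in RAG-style applications.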

Internal Marketplace (image courtesy of Snowflake)

Sample SQL code to create an Iceberg table in Fabric OneLake

Microsoft Fabric

SQL database in Fabric

  • For more technical details on Fabric's new features and updates from November, see the Fabric November 2024 Feature Summary on the Microsoft Fabric Blog. My favorite features in the November update: 1) small multiples for the new card visual, 2) text slicer, 3) metric sets (gather measures and their common slice and dice scenarios in one place), 4) TMDL extension for VSCode (develop semantic models in VSCode), 5) tenant switcher (at last - no logging off to switch Fabric tenants!), 6) support for spaces and special characters in Delta table names, 7) integration with Esri ArcGIS in Fabric Spark, 8) table and partition refresh in the Semantic Model Refresh activity of data pipelines, 9) new Fabric events in Real-Time Hub (you will be able to use Activator to trigger data pipelines via the Reflex when new files show up in OneLake), 10) AutoML UI in the Fabric Data Science workload (see a video with Estera Kot, PhD and Misha Desai to learn more: Low Code AutoML UI in Microsoft Fabric Data Science).

Metric set in Fabric

Databricks

Multi-page reports in AI/BI Dashboards (image courtesy of Databricks)

  • Also, Databricks announced a new notebook integration with AI/BI Dashboards. This new capability allows developers to easily transition from exploratory data analysis done in notebooks to dashboards, avoiding context switching and recreating visual artifacts in multiple places.
  • Another workload where Databricks invests a lot is Databricks SQL. Read a summary of what landed in this workload in October: What's new with Databricks SQL, October 2024. Some recent improvements mentioned in the blog post: query profiling, automated statistics, AI-generated comments, publishing to Power BI.
  • As part of their work on performance optimization, Databricks announced a gated Public Preview of Predictive Optimization for Statistics. According to Databricks, Predictive Optimization delivers the following advancements: 1) intelligent selection of data-skipping statistics, eliminating the need for column order management, 2) automatic collection of query optimization statistics, removing the necessity to run ANALYZE after data loading, 3) once collected, statistics inform query execution strategies and, on average, drive better performance and lower costs.
  • From the community, I read a nice blog post, How to Effectively Manage Databricks Costs by Grace O'Halloran (a walkthrough of Databricks cost tracking with a roadmap of related Databricks features and a link to a Power BI template for cost tracking).
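
A side note on the data-skipping statistics mentioned in the Predictive Optimization item: the general idea is that an engine keeps per-file min/max values for selected columns and skips files whose range cannot match a query filter. The Python sketch below only illustrates that idea and is not how Databricks implements it. The `FileStats` class, the `files_to_scan` function and the file names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileStats:
    """Hypothetical per-file statistics for a single numeric column."""
    path: str
    min_value: int
    max_value: int


def files_to_scan(files: List[FileStats], lower: int, upper: int) -> List[str]:
    """Return paths of files whose [min, max] range overlaps [lower, upper].

    Files outside the range cannot contain matching rows, so they are skipped.
    """
    return [
        f.path
        for f in files
        if f.max_value >= lower and f.min_value <= upper
    ]


if __name__ == "__main__":
    # Made-up file names and ranges, for illustration only.
    stats = [
        FileStats("part-0.parquet", 1, 100),
        FileStats("part-1.parquet", 101, 200),
        FileStats("part-2.parquet", 201, 300),
    ]
    # A filter like "WHERE id BETWEEN 150 AND 180" only needs one file.
    print(files_to_scan(stats, 150, 180))
```

The fewer files that overlap the filter range, the less data gets read, which is why choosing the right columns to collect statistics on matters.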


Sources of news and updates

If you are looking for resources useful for staying up to date with Snowflake, Fabric and Databricks, see a list I shared in the August 2024 edition of this newsletter.


My quick summary

Pretty busy month for Snowflake and Fabric, with some nice updates from Databricks as well. My quick summary of the last month:

  1. Snowflake remains consistent in prioritizing their investments in security, governance and AI. Cortex Analyst got significant improvements, making it more competitive against text-to-SQL features from other vendors. Also, I think the way Snowflake makes AI features integrated and simplified (new chunking function coming, integration of Cortex Analyst and Cortex Search, etc.) can become their advantage. And of course, I can't wait to see the integration with Claude models, the ability to create agents with Snowflake Intelligence and the outcome of the Datavolo acquisition.
  2. Fabric got heavier with another workload landing in the product. Of course, databases make the overall story of the platform even more compelling, but at the same time SQL databases in Fabric will consume the same capacities as all the analytics workloads. That's why I'm super happy to see a number of features related to administration and governance announced last month. I hope things like Surge Protection and Workspace Monitoring (Capacity and Tenant Monitoring soon?) will help improve the quality of life of platform admins and owners in Fabric. It's great to see Real-Time Intelligence reach GA (this is for sure one of the most underestimated parts of Fabric). Also, we got tons of new features in November, but some haven't been rolled out to all regions yet. I find it a bit annoying to see newly announced stuff like Workspace Monitoring in one tenant, while it's not available in another one located in one of the major Azure regions.
  3. Databricks continue competing against Snowflake in the warehousing area, investing more and more in Databricks SQL and serverless warehouses. Also, I see them gently stepping on Microsoft's toes by making AI/BI Dashboards more and more advanced every month (IMO, to say they effectively compete against Power BI right now would definitely be an overstatement). BTW, I remember one sentence from the original announcement of AI/BI back in June: "They (dashboards) also don't come with things you don't want – no cumbersome semantic models, no data extracts, and no new services for you to manage." Interesting, huh?

That's all, folks. As always, share in the comments any interesting updates, articles, videos etc. you found last month. Thanks for reading and until next time.

Oh, and if you're interested in seeing the latest features of Microsoft Fabric in action and you're located in or near Warsaw, join me at the 142nd meeting of Data Community Poland in Warsaw on December 5th at 6PM CET. Register here by Tuesday noon: 142. WAW DCPL - Microsoft Fabric Post-Ignite Demo Festival.
