Big thanks to Sandeep Devarapalli for his in-depth exploration of Apache Doris and for highlighting its key features. 👍 We encourage users to find out whether Doris is the right fit for their specific use cases and to share their experience. For those seeking a general overview of Apache Doris, we recommend starting with this talk given by the Apache Doris PMC Chair: https://lnkd.in/g7byRjp5 For those who have specific questions about Doris, we invite you to join our Slack community. This is where you can engage with other Doris users and meet our support team, who will be happy to provide help and guidance! 🙌 https://lnkd.in/ghMuVZW2
Is Apache Doris set to outpace ClickHouse in the analytical database arena?

As claimed by Doris (their official blog post, link in comments), ClickHouse is not designed for multi-table processing, so you might need an extra solution for federated queries (cross-database queries without data migration) and multi-table join queries (big claim).

Doris is good at high-concurrency queries and join queries, and it is now equipped with an inverted index to speed up searches in logs. Doris supports multi-table joins natively, whereas ClickHouse, which is optimized for single-table analytics, may require an external solution (like a data virtualization layer or a federated query engine) to achieve similar cross-table processing.

On top of that, in a test done by an e-commerce SaaS provider, Doris outperformed ClickHouse in 10 of 16 queries, delivering up to 30x faster execution.

4B rows (full and filtered join queries): Doris was 2-5x faster than ClickHouse, which faced memory issues, and the performance gap widened on larger dimension tables (over 10x).
25B rows (full and filtered join queries): Doris completed queries in seconds, whereas ClickHouse took minutes or failed on large tables (over 50M rows).
96B rows (large-scale queries): Doris handled all queries effectively; ClickHouse couldn't execute these at all.

With newer features in Doris v3, such as compute-storage decoupling, asynchronous materialized views, better semi-structured data management, and memory optimizations for Parquet/ORC read and write operations, ClickHouse might need to gear up at some point or risk losing some market share.

With these advancements, Doris 3.0 is closing the gap with ClickHouse, especially in areas where SQL compliance and ease of use are critical. Orgs that prioritize standard SQL support and seamless integration might find Doris to be a more suitable fit.

Is Doris set to eat into ClickHouse's market share?
The signs are there, particularly as more enterprises prioritize compatibility and ease of integration over niche performance metrics. One thing in ClickHouse's favor may be Google Trends: Doris has yet to catch up in terms of the number of internet searches. In the end, Doris integrates tightly with the entire Apache ecosystem and suite of software, which can't really be said for ClickHouse (think workarounds).

Would love to hear thoughts from others who've been hands-on with either of these systems. Are you considering a switch or evaluating Doris for your next project?
Thanks for the shoutout, Apache Doris.
Would love to get detailed insights into how these conclusions were reached (image). For example, when comparing SQL compatibility, how do we conclude that ClickHouse has limited complex query support? Which queries does ClickHouse not support that the others do? The same question applies to the other feature categories as well.