Learn what batch and stream processing are, how they differ, and the pros and cons of each for big data analysis.
Learn about the current and future trends in big data architecture and how to design a solution that meets your business and data needs.
Learn some best practices and tips to encrypt, authenticate, authorize, audit, isolate, and update your MapReduce data and code in this article.
Learn how to keep up with the latest trends and developments in Oozie and Airflow for big data. Find tips and resources on how to learn, track, and compare them.
Learn what Kerberos is, how it works, and why you should use it to secure your big data applications. Get tips on configuration and troubleshooting.
Learn how to optimize Kafka and Flume parameters and configurations for performance and scalability, and avoid common pitfalls and bottlenecks.
Learn six key aspects of Spark Streaming optimization: batch size, data partitioning, checkpointing, backpressure, fault tolerance, and monitoring.
Learn how big data and IoT can help you collect, analyze, and act on data from multiple sources, and what challenges and solutions you need to consider.
Learn how to use logs, metrics, tracing, testing, debuggers, and profilers to debug big data applications across various frameworks and tools.
Learn how to test and debug NoSQL databases for social media and web applications with these tips and tools.
Compare Oozie and Airflow based on their features, pros, and cons for managing big data workflows. Learn how to choose the best tool for your use case.
Learn key metrics and best practices to assess and improve your metadata and catalog strategy for big data assets.
Learn what streaming and real-time data processing are, how they differ from batch processing, and some of the most common use cases and scenarios for big…
Learn how to define data quality metrics, implement data validation and cleansing, and use checkpoints and state management with Spark Streaming.
Learn how to use metrics, logs, checkpoints, backpressure, debugging, testing, tuning, optimization, monitoring, alerting, and best practices for Spark Streaming…
Learn how to fix syntax, performance, compatibility, quality, and security issues in SQL for big data queries with tips and tools.
Learn how to use unit testing and integration testing effectively and efficiently in big data projects. Discover the best practices and common challenges of…
Learn how to protect your data from unauthorized access or disclosure in ETL and ELT frameworks, and what trade-offs to consider when choosing between them.
Learn how to balance factors such as throughput, availability, latency, data skew, consumer groups, and topic growth when choosing partitions for your Kafka topics.