Headed to #kubecon next week? Join us for a private hands-on workshop featuring Kelsey Hightower, former distinguished engineer at Google. 🔥 Seats are limited. We’ll recreate real-world application failures and work to detect and mitigate issues before the clock runs out. 🛠 Designed for experienced SREs and developers, the workshop covers essential and advanced techniques to find, troubleshoot, and fix problems in containerized applications. Register here: https://lu.ma/wk1foczq (while space is available) cc: Tony Meehan and Lyndon K. Brown #reliability #kubernetes #incident #sre #platformengineering #devops #engineering
Prequel.dev’s Post
More Relevant Posts
-
https://lnkd.in/dchY4VDH Introduction: Site Reliability Engineering (SRE), introduced by Google in the early 2000s, is a crucial discipline that bridges the gap between software development and IT operations, offering higher operational reliability and improved system performance... #SiteReliabilityEngineering #SRE #DevOps #ITOperations #SystemReliability #AutomationInOps #ErrorBudgets #SLOs #SoftwareEngineering #OperationalEfficiency #TechInnovation #CloudComputing #DigitalTransformation #TechCareers #InfrastructureManagement
-
Think #SRE is just for tech companies? Think again. We've helped modernize critical infrastructure in the public sector, enabling government agencies to deliver more reliable, efficient services to citizens — all while optimizing public resources. Discover how SRE principles are making waves beyond Silicon Valley in our latest blog: https://lnkd.in/dScUwPxw #platformengineering #SRE #DevOps #SLI
-
Want to learn about the traffic management practices of the hyperscalers, and why they may now be relevant to your business?
Are you a Site Reliability Engineer, Software Engineer or Engineering Leader looking to build and scale systems with reliability in mind? You're not going to want to miss this rollercoaster 🎢 Join our upcoming panel to hear traffic experts from the hyperscalers discuss topics like: principles of high availability, important software resilience patterns, and how to avoid common causes of overload. 👉 REGISTER HERE: https://lnkd.in/d3bt95XB #webinar #sre #devops #stanza #sitereliability #sitereliabilityengineering
-
🚀 Day 3/45: Embarking on a DevOps Journey! 🚀 As I wrap up Day 3 of my 45-day challenge to dive into the world of DevOps, I'm thrilled to share some of the incredible milestones I've achieved along the way. 🔹 Virtual Machines: I delved deep into the fundamentals of virtual machines, understanding how they operate, their architecture, and how they play a pivotal role in creating isolated environments for running applications. This knowledge has given me a solid foundation in managing infrastructure more efficiently. 🔹 Hypervisors: Learning about hypervisors was a game-changer. I explored both Type 1 and Type 2 hypervisors, understanding their differences and how they help in virtualizing hardware to create multiple virtual machines. This has opened up new perspectives on resource allocation and optimization. #DevOps #LearningJourney #VirtualMachines #Hypervisor #CareerGrowth #TechJourney #ContinuousLearning For more insights click on the given link:-
-
Eye-Opener: #FinOps is the next big thing in platform engineering. (Not my words, but from one of my mentors who has been working in this space since the early days when Site Reliability Engineering was still exclusive to Google.) Personally, I’ve been somewhat overlooking #FinOps as a formal practice, even though I’ve had the opportunity to work on a few FinOps engagements throughout my career and have seen its impact firsthand. Yet, at the end of the day, building tools to understand cloud expenditures and then embedding a recommendation system to re-evaluate architectural decisions doesn’t always feel exciting. However, my perspective recently shifted after starting a project where I’ve been tasked with engineering a multi-cloud management platform. Of the feature set, 40% directly relates to what we do in a FinOps engagement. While developing this platform, I’ve noticed that many of the "cool" features can be implemented through open-source software. But when it comes to FinOps, there's a notable gap in open-source tools that can truly deliver business value. This is exactly where, as my mentor mentioned, the next revolution in platform engineering is likely to emerge. A final thought: I’m aware that many large tech companies have already built sophisticated tools and practices to achieve what we think of as FinOps today. But imagine if we could build a platform that not only handles orchestration and management but also incorporates cost control in a meaningful way. It could dramatically transform how we approach cloud adoption and modernization.
-
Logs, traces, metrics: the holy trinity of observability. But here's the kicker - most teams only use one or two. In my years as a Platform Engineer, I've seen countless incidents that could've been solved faster with all three. Here's why each matters: 1. Logs: Your system's diary. Captures events as they happen. 2. Traces: The GPS of your requests. Shows the journey through your system. 3. Metrics: Your system's vital signs. Quantifies performance over time. Using all three? It's like having X-ray vision for your infrastructure. Example: Recently, we slashed our MTTR by 40% just by correlating metrics with traces. Pro tip: Start with one, master it, then add the others. Progress beats perfection. What's your go-to observability tool? Share below! #Observability #DevOps #SRE
-
What are the most common causes of system overload? How do you protect against them? 🤔 Tune in as a panel of traffic experts, including Niall Murphy (CEO of Stanza), Tobias Weingartner (SRE @ Google), and John Reese (ex-Robinhood SRE), discusses the reliability practices of the hyperscalers and why your organization can also benefit from adopting them. 👉 REGISTER HERE: https://lnkd.in/gWVx8c4y #webinar #sre #devops #stanza #sitereliability #sitereliabilityengineering
-
The early days of Google infrastructure presented unique challenges. On Tech on the Rocks, Brian Grant discusses how the reliance on single-threaded C++ in a pre-multi-core world shaped the architecture and performance of Borg, laying the groundwork for Kubernetes. Gain valuable insights into the evolution of distributed systems: https://lnkd.in/gZHRm5ir #kubernetes #borg #C++ #singlethreaded #performance #scalability #softwareengineering
-
I’m reading about the internal release engineering process at Google, and I really like how they automate their build and deployment process with an internally built tool called Rapid. I like the Rapid system architecture, which is very similar to the Flink architecture. The Rapid Worker is much like the Flink JobManager, which acts as the dispatcher for Flink applications submitted through the JobManager’s front-end UI; that UI can be seen as analogous to the Rapid Service. The Rapid Tasks, in turn, can be extended to work like the Flink TaskManager, which communicates with external systems to execute the submitted workflow. If I joined Google today as a cloud infrastructure engineer, I’d really want to understand the Rapid architecture better, to learn how many tasks it can run concurrently and to improve that by building a dedicated TaskManager that scales with self-service task slot allocation during workflow submission. A lot of companies don’t really have a consistent and reliable build process; they just cherry-pick different tools for short-term goals, with no clear path to sustainability. When I have some time, I might replicate it and improve on it for some of my own work. Brilliant idea. #cloudengieering #cloudcomputing #softwareengineering #coding #programming #devops
-
Interested in learning how SREs are reshaping the tech landscape? Join us tomorrow, April 30th, at SKILup Day: SREs and the Rise of Platform Engineering to explore the evolution of SREs in platform engineering. https://bit.ly/4az7z8R We'll discuss how SREs are reshaping DevOps workflows at scale, playing a crucial role in driving efficiency and resilience. At #SKILupDays, you'll learn: ➡ How SREs fortify systems against disruptions ➡ How SRE-led solutions effortlessly adapt to evolving demands ➡ How to embrace automation and data-driven decision-making to maximize operational efficiency Don’t miss out! Register now to gain valuable insights and actionable strategies to empower your journey to SRE and platform engineering. https://bit.ly/4az7z8R PeopleCert #SREs #PlatformEngineering #DevOps
Co-Founder & CEO at 3Mór | Dependabot for the entire technical stack, where we find, filter, and fix for DevOps teams.
1mo
Having seen Kelsey speak and tracking what you all are doing at Prequel, this is going to be a must-see event!!