Latest

Mar
06

Alert on symptoms, not causes

When you are bringing a new system to production you know that you ought to define SLIs, set up instrumentation,
5 min read
Feb
16

How about we forget the concept of test types?

I have found that the concept of test types (unit, integration, and so on) does more harm than good. People
7 min read
Jan
09

How organisations cripple engineering teams with good intentions

I believe that engineers are at their best when they complement strong technical expertise with skills from other disciplines such
10 min read
Feb
12

Migrating an Eureka-based microservice fleet to Kubernetes

I have written about how a lot of the value in our internal Platform at Adevinta lies in the glue between systems. This post gives a deep dive into some of the technical problems we find as we transition teams into our Kubernetes-based PaaS, and what kind of glue helps us overcome them.
14 min read
Jan
20

How to build a PaaS for 1500 engineers

This article is based on a presentation I gave as part of AdevintaTalks in Barcelona in November 2019, explaining the strategic principles we used to build an internal platform to support all the online marketplaces in the group (including some of the biggest in Europe and South America).
17 min read
Oct
22

"Kubernetes made my latency 10x higher!".. or maybe not?

As we migrate teams over to Kubernetes, I’m observing that every time someone has an issue there is a knee-jerk reaction like the title. Kubernetes is to blame. Investigation usually shows that the explanation boils down to the nuances of blending complex systems together.
7 min read
May
29

Sizing Kubernetes pods for JVM apps without fearing the OOM Killer

While migrating teams from on-prem/EC2 infrastructures to Kubernetes, we hit some issues with resource allocation of JVM apps. I will explain how running in Kubernetes forces us to think about capacity planning more than we’re used to, changing some of the assumptions we made before containers.
9 min read
Apr
28

GC forensics by example: multi-second pauses and allocation pressure

This post analyzes a HotSpot GC log exhibiting large GC pauses (> 1 min), an investigation that leads to allocation pressure and system load as causes of pathological behaviour in HotSpot’s garbage collector.
12 min read
Jan
30

How does the default hashCode() work? (and why does it affect biased locking?)

In which scratching the surface of hashCode() leads to a speleology trip through the JVM source reaching object layout, biased locking, and surprising performance implications of relying on the default hashCode().
11 min read
Aug
28

Contention on sun.misc.Cleaner

I recently found a (dying) JVM with about 10 threads BLOCKED on the sun.misc.Cleaner class instance. This was not the root cause of the failure, but I could learn something by looking into those blocked threads.
3 min read