Salaries in the German IT Sector: a Case Study in Critical Review of Statistics

Recently I read the results of a salary survey among Russian-speaking software developers in Germany, published on dou.ua. I was skeptical about the validity of its conclusions and voiced my criticism (somewhat less politely than I should have). But the survey author reasonably pointed out that he had done his best in his free time and had provided the raw survey data. Recalling a popular motto from the Soviet Union, "if you disagree, criticize; if you criticize, do it better", I will try to interpret the survey results more correctly.

There are lies, damned lies, and statistics (Mark Twain (among others))
The only statistics you can trust are those you falsified yourself (attributed to Winston Churchill)

When I looked at the survey results, I immediately noticed several critical issues, namely:

1. The survey is overconcentrated in Berlin and Munich.
This might be explained by the fact that these two cities are indeed the start-up capitals, but the start-up segment cannot be representative of the whole IT sector.
E.g. Karlsruhe is another big IT region, though one dominated by established companies.
2. Age distribution.
I don't doubt the genuineness of the data, since highly skilled migrants from the ex-USSR (and, likely, most of them are from Ukraine) are usually young. But once again, this subgroup is not representative of the whole sector.
3. Skill levels.
About 1/2 of the developers are under 30, whereas 2/3 of them are senior?!
Yes, in the (ex-)USSR students graduate much earlier, at approximately the age of 22. And yes, programming (with very few exceptions) is a craft rather than a science (in other words, you don't need much education to code HTML).
But seniority means not only software skills but also e.g. soft skills, which come only with life experience.
Thus the survey participants likely exaggerate their skill level a little bit.
4. And last but definitely not least, the salary levels.
a) I am quite aware of what German companies are ready to pay (there is little individual meritocracy and a lot of collectively agreed pay-scale grouping).
b) Similar salary statistics for Berlin and Munich are likely nonsense (not only due to the much higher living costs in Munich, but also due to Berlin's motto "Berlin is poor but sexy").
Actually, the following review will concentrate on this last point, since I have already commented on 3. and (without conducting a new survey) can hardly do anything about 1. and 2.

So:
1. First of all, there is not that much data, so a data scientist should not be lazy and should screen it manually. Doing so, I found quite a few duplicate entries.
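The manual screening for duplicates can be supported by a small script. The following is only a sketch: the field names (`city`, `position`, `age`, `salary`) and the three records are made up for illustration and are not the survey's actual columns or data.

```python
# Hypothetical records; the real survey columns may be named differently.
rows = [
    {"city": "Berlin", "position": "Java Developer", "age": 29, "salary": 60000},
    {"city": "Berlin", "position": "java developer ", "age": 29, "salary": 60000},
    {"city": "Munich", "position": "QA Engineer", "age": 31, "salary": 55000},
]

def normalize(row):
    """Build a comparison key that ignores case and surrounding whitespace."""
    return tuple(
        value.strip().lower() if isinstance(value, str) else value
        for _, value in sorted(row.items())
    )

def find_duplicates(rows):
    """Return the indices of rows whose normalized key has been seen before."""
    seen = set()
    duplicates = []
    for index, row in enumerate(rows):
        key = normalize(row)
        if key in seen:
            duplicates.append(index)
        else:
            seen.add(key)
    return duplicates

print(find_duplicates(rows))
```

Such a check only finds candidates. Two different people may well give identical answers, so the flagged rows still need a look by eye.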
2. From my own extensive experience I know that a typical German company is reluctant to pay a software developer more than €65,000 (an exceptional specialist might expect €75,000). Domain knowledge is rewarded to some extent, but it is implausible to assume it by default in young migrants.
However, there are a lot of Freiberufler (contractors) in German IT. They are paid much better (the minimum rate is €55/hour; a typical rate is about €65/hour). So let us take a closer look at the salaries above €75,000 p.a.
First, I try to identify contractors (they are marked with F[reiberufler]). The easiest case in which to formally conclude that a person is a contractor is when the person currently earns less than a year ago (contractors have no fixed salary). Of course, these might also be people who moved to a start-up, but that is not typical for Germany, even in the start-up capitals.
A further approach to identifying contractors is to take a closer look at the position description. E.g. it is generally known that embedded C++ programmers are mostly contractors.
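The contractor heuristics above can be sketched in a few lines. This is an illustration, not the procedure I literally ran: the field names (`salary_now`, `salary_last_year`, `position`) are hypothetical, and the 1,600 billable hours per year are my own assumption, used only to translate hourly rates into a yearly figure.

```python
HIGH_SALARY = 75_000                  # threshold from the text, EUR per year
BILLABLE_HOURS = 1_600                # assumption: billable hours per year
CONTRACTOR_HINTS = ("embedded c++",)  # position hint from the text; extend by hand

def annual_revenue(hourly_rate, hours=BILLABLE_HOURS):
    """Gross yearly revenue of a contractor at the given hourly rate."""
    return hourly_rate * hours

def flag_entry(entry):
    """Return 'F' for a likely contractor, '?' for manual review, '' otherwise."""
    if entry["salary_now"] <= HIGH_SALARY:
        return ""  # not a high salary, nothing to explain
    last_year = entry.get("salary_last_year")
    if last_year is not None and entry["salary_now"] < last_year:
        return "F"  # income dropped year over year: no fixed salary
    position = entry.get("position", "").lower()
    if any(hint in position for hint in CONTRACTOR_HINTS):
        return "F"  # position typically filled by contractors
    return "?"      # high salary without an obvious explanation

print(annual_revenue(55), annual_revenue(65))
print(flag_entry({"salary_now": 90_000, "salary_last_year": 100_000,
                  "position": "Java Developer"}))
```

Under the assumed 1,600 billable hours, the quoted rates of €55 and €65 per hour correspond to €88,000 and €104,000 of gross revenue per year, which is why the entries above €75,000 are the ones worth checking for contractors.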

As a next step, I mark with x the entries with high but not implausible salaries, which might be explained by position (CEO, CTO, Team Lead), domain knowledge (Data Science, Computer Vision, Project Manager) or exceptionally long work experience.

Finally, as you can see in the screenshot, not many entries are considered implausible. But there are some: e.g. I will never believe that a PHP developer (even a senior one, even a contractor) earns €120K in Berlin. And even if so, this is a definite outlier!

3. Finally, what about the puzzle "Munich vs. Berlin"?
Well, the quantile-quantile plot and the box plot solve it, at least partially.
As you can see, the medians are approximately equal; however, there are significantly fewer salaries below €60K in Munich (since you can hardly survive there on a lower salary).
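The numbers behind such a comparison are easy to compute without any plotting library. The sketch below uses invented salary lists for the two cities (they are NOT the survey data) that merely mimic the pattern described above: equal medians, but a much thinner lower tail in Munich.

```python
from statistics import median, quantiles

# Invented illustration data, EUR per year; not taken from the survey.
berlin = [45_000, 50_000, 55_000, 60_000, 64_000, 70_000, 75_000, 80_000]
munich = [58_000, 61_000, 61_500, 62_000, 62_000, 66_000, 72_000, 80_000]

def summary(salaries, threshold=60_000):
    """Quartiles, median and the share of salaries below the threshold."""
    q1, _, q3 = quantiles(salaries, n=4)  # box-plot hinges
    below = sum(s < threshold for s in salaries) / len(salaries)
    return {"q1": q1, "median": median(salaries), "q3": q3, "share_below": below}

for city, salaries in (("Berlin", berlin), ("Munich", munich)):
    print(city, summary(salaries))
```

Comparing `q1` and `share_below` between the two cities is the numerical counterpart of what the box plot shows at a glance: the centre of the distribution can coincide while the lower part differs.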
