RONAN - WHERE DO THE DUTCH GET THEIR CARS?

WHERE DO THE DUTCH GET THEIR CARS?

The Dutch vehicle authority (RDW) keeps track of all vehicles that are registered in the Netherlands. The resulting dataset contains a lot of information about each vehicle and is openly available to the public. In this blog post, I analyse this dataset to find out where cars in the Netherlands originate from.

In recent times, there has been a lot of discussion about Chinese car brands entering the European market at very low prices compared to their European counterparts. This made me wonder to what extent these Chinese car manufacturers have already penetrated the Dutch market and how this compares to manufacturers from other countries. Therefore, I decided to use the RDW vehicle registration dataset to create an overview of the countries of origin of cars in the Netherlands. This blog post will explain the steps that have been taken in this analysis and the results that were found.

The dataset

The dataset that I have used for this analysis is called the "Gekentekende voertuigen" dataset, which translates to "Registered vehicles". This dataset can be found on the RDW open data portal and can be downloaded by anyone. The dataset is stored in CSV format with 98 columns, with each row containing the data for a single vehicle. For this analysis, I have mainly focused on the "Merk" (brand) column, which contains the brand name of a vehicle.

Data Analysis

The data analysis was done in two steps. The first step was the creation of a shell script that extracts only the brand column from the large dataset. The shell script also creates a list of the unique car brands in the dataset, so that these can later be mapped to their country of origin. The second step was the creation of a Python script that reads in the extracted brands and the brand list to match each vehicle to a country of origin. This Python script then uses the Folium library to output the results as an interactive map that shows the number of vehicles from each country.

Shell script

The first step in this analysis is thus the creation of a shell script. The shell script reads in the original CSV dataset and outputs multiple new CSV files. The most important of these new CSV files is the one that only contains the brand column of the dataset. It is important to note that the rows are also filtered to only include passenger cars, as the dataset also contains vehicles like motorcycles and trucks. Keeping only vehicles of the "Personenauto" and "Auto" type ensures that only passenger cars are included in the analysis. For the list of unique brands, the script only selects brands that have at least 100 registered vehicles. This filters out unknown or very rare brands for which the country of origin cannot be established easily. As the CSV file with each brand's country of origin has to be created manually, this filtering step saves a lot of time. The missing brands also have very little impact on the final results, as they only make up a very small fraction of the total number of registered vehicles.
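To make the filtering logic concrete, here is a minimal sketch of the same idea. Note that this is written in Python purely for illustration and is not the actual shell script from the repository. The "Merk" column comes from the dataset description above, while the "Voertuigsoort" column name, the function name and the sample rows are assumptions made for this example.

```python
import csv
import io
from collections import Counter

# Vehicle types treated as passenger cars, as described in the text.
PASSENGER_TYPES = {"Personenauto", "Auto"}

# Made-up sample rows; "Voertuigsoort" as the type column is an assumption.
SAMPLE_CSV = (
    "Kenteken,Voertuigsoort,Merk\n"
    "AA0001,Personenauto,VOLKSWAGEN\n"
    "AA0002,Personenauto,VOLKSWAGEN\n"
    "AA0003,Personenauto,TOYOTA\n"
    "AA0004,Motorfiets,YAMAHA\n"
)


def count_passenger_brands(rows, min_count=100):
    """Count brands of passenger cars and list brands with >= min_count cars."""
    counts = Counter(
        row["Merk"].strip()
        for row in rows
        if row["Voertuigsoort"] in PASSENGER_TYPES
    )
    # Only these brands need a manually assigned country of origin.
    frequent = sorted(brand for brand, n in counts.items() if n >= min_count)
    return counts, frequent


reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
brand_counts, frequent_brands = count_passenger_brands(reader, min_count=2)
print(frequent_brands)
```

With the real dataset, the default threshold of 100 would be used instead of the small value chosen here for the sample rows.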

Python

Once the previously mentioned CSV files have been generated by the shell script, a Python script is used to read in these files and perform the analysis. The Python code reads in both the CSV file with all the vehicle brands and the CSV file that maps each brand to a country of origin. The dataset itself is read in chunks to reduce memory usage, with each chunk stored in a pandas DataFrame for further processing. For every chunk of the dataset, the brand of each vehicle is matched to its corresponding country of origin using the mapping CSV file. As they are matched, a count is kept of the number of vehicles from each country. Once a chunk has been processed, its counts are added to the running total for each country. The final step is to create an interactive world map of the results by using the Folium library.
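The chunked counting described above could look roughly like the sketch below. It is not the code from the repository: the function name, the "Land" column of the mapping file and the sample data are assumptions for this example. Only the "Merk" column and the use of pandas with chunked reading come from the description above.

```python
import io
from collections import Counter

import pandas as pd


def count_by_country(brand_csv, mapping_csv, chunksize=100_000):
    """Count vehicles per country of origin, reading the brands in chunks."""
    mapping = pd.read_csv(mapping_csv)
    brand_to_country = dict(zip(mapping["Merk"], mapping["Land"]))

    totals = Counter()
    for chunk in pd.read_csv(brand_csv, usecols=["Merk"], chunksize=chunksize):
        # Brands without a mapping become NaN and are dropped by value_counts().
        countries = chunk["Merk"].map(brand_to_country)
        totals.update(countries.value_counts().to_dict())
    return totals


# Made-up sample data standing in for the two CSV files.
brands = io.StringIO("Merk\nVOLKSWAGEN\nTOYOTA\nVOLKSWAGEN\nPEUGEOT\nUNKNOWN\n")
mapping = io.StringIO("Merk,Land\nVOLKSWAGEN,Germany\nTOYOTA,Japan\nPEUGEOT,France\n")

totals = count_by_country(brands, mapping, chunksize=2)
print(totals.most_common())
```

Because only one chunk is held in memory at a time, the memory usage stays roughly constant regardless of the size of the input file.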

Folium

The Folium library allows for the creation of interactive maps in Python by leveraging the Leaflet.js library. In this analysis, Folium is used to create a minimalistic choropleth map that shows the world's political borders. Each country within these borders is matched to its vehicle count and assigned a color based on that count. The resulting map allows for zooming and panning to explore the data in more detail. When hovering over a country, a tooltip is shown that displays the country's name and the number of vehicles from that country.

Results

The maps below show the final results of this data analysis. The first image shows a static map and the second one is an interactive version of the same map, which can also be opened in full screen by clicking the "VIEW" button below it. Looking at the data, it is clear that the majority of cars in the Netherlands originate from European countries. Germany is by far the largest supplier of cars in the Netherlands with 3.4 million, followed by Japan and France with 1.8 and 1.7 million respectively. The United States is the fourth largest supplier with 0.8 million cars, followed by South Korea with a slightly lower number. Perhaps somewhat surprisingly, there are also brands from countries like Latvia, Serbia and Austria present in the dataset. These countries each account for around 150 vehicles, with brands like Zastava from Serbia, Puch from Austria and RAF from Latvia.

[Image: static map of the results, 2025]

Conclusion

In conclusion, this analysis of the RDW vehicle registration dataset has provided some interesting insights. It is clear that German brands are still responsible for the largest portion of cars in the Netherlands. However, there is also a large presence of brands from other continents, with countries like Japan, South Korea and the United States reaching a combined total of around 3.5 million vehicles. This shows that Dutch car owners are not necessarily loyal to European-based brands and are open to vehicles from other parts of the world as well. With the rise of Chinese car manufacturers in recent times, it will be interesting to see to what degree these brands will be able to penetrate the Dutch market. At this moment in time though, their presence is still relatively limited, with fewer than 50,000 vehicles registered in the Netherlands. It is however important to note that Chinese brands are still relatively new to the European market, which means that their market share is likely to increase in the coming years.

My current plan is to update this analysis on a yearly basis, starting in January 2026, to keep track of how the market shares of each brand change over time. It will be interesting to see whether Chinese brands significantly increase their market share and, if so, at whose expense. At the bottom of the page you can find the "CODE" button, which links to the GitHub repository for this analysis if you want to take a closer look at the code that has been used.

Limitations

It is important to note that there are some limitations to this analysis. Some of the brands found in the filtered dataset are, for example, trailer manufacturers or companies converting existing vehicles into camper vans. These entries occur in very low numbers though and therefore have very little impact on the final results. It could however be questioned whether these trailer manufacturers have actually registered a car, and whether the converted camper vans should be counted towards the original brand of the vehicle instead. Due to limitations in this dataset however, it is not possible to filter such entries without a lot of manual work. Besides this limitation, brands with fewer than 100 vehicles have also not been taken into account, for the reasons mentioned earlier.