Power BI - Excel vs CSV vs MySQL - a Real Case
In this short article, I want to comment on a real case of performance issue and how we solved it.
Case
We received from a client an Excel file containing a data sample to develop a new BI application.
As the first activity, we started to analyze the content, columns, data formats, nothing special, and we connected the Excel file as a Power BI Desktop application data source. Again, nothing different from what we did before in a similar situation.
However, we noticed that the data load on Power BI was slow, very slow.
The Excel file has a usual design, columns with labels and values, and about 82,000 records. The file size is 11 MB.
Could this be the reason?
But we have models much bigger than this.
The notebook used was a Ryzen 7 with 16 threads and 32 GB of memory, enough power to do the task easily.
Recommended by LinkedIn
Analyzing the issue
We made tests with interesting results:
Conclusion
We finished the day with the clear idea that the problem with Excel was the file's opening and reading process. Considering the compression level, the CSV file with the same content is double the size, the uncompressing process could be consuming much time.
On the other hand, the MySQL connection had an incredible performance if we think about the complexity of establishing a database connection and collecting data across the Atlantic Ocean (I'm in Portugal).
So, I finished with the tip:
To perform better, avoid connecting big Excel files. Convert them to CSV or use a database.