Power BI Desktop is one of those “magic tools” that everyone should know who occasionally or regularly processes, transforms, or evaluates data. Reading, transforming, analyzing, presenting and creating relationships between our data has never been easier than with Power BI Desktop.
In Power BI, many of these of tasks can be done just by drag ‘n drop or clicking, so that even those who switch from Excel can get very, very far without having to deal with the two integrated languages M and DAX .
For some time now, Power BI Desktop allows us using the script languages R and Python. Both languages are widely used in data analysis, are well documented and thus offer possibility to increase the “effectiveness” of Power BI.
So it’s time for an intro in Python in Power BI.
To use Python in Power BI, we need a Python installation on our PC. I recommend a current Python 3 version.
We can check whether the installation was successful by opening the command prompt (press the Windows key, type
cmd and press Enter). In the prompt enter
python --version. The installed Python version should now be displayed here.
Since I had some problems with Python 3.8, I switched to 3.7. and later back to 3.8. I haven’t tested version 3.9 yet. Don’t give up! Sometimes you have to try something until it works.
If it does not work, the reasons can be very diverse. Here are just three tips for troubleshooting:
- Installation itself showed error messages? — Search google for the error
- Python command is not known to the computer? — Restart and try again
- Python command is not known to the computer? — You may have a problem with the Python path and the environment variable, solutions for this are available by using Google.
If the installation was successful, we install Pandas, a Python library for data analysis and Matplotlib, a library for visualizing data.
pip install pandas
pip install matplotlib
Please note, company proxies and VPN networks occasionally cause trouble here.
Together with pandas, numpy and one or the other useful library will be installed. This is enough for the first steps, we can finally install additional packages at any time later.
Preparing Power BI
We can now finally open Power BI and prepare it for the use of Python.
In the options, we go to Python scripting and check whether Power BI has got the right path to our Python version. Usually this should work automatically, otherwise we enter the correct path here.
Note: There are still various instructions circulating in which settings had to be made under the item “Preview features”. As of December 2019, this is no longer necessary as the Python integration has grown out of the stage of a preview feature.
Optional: create virtual environment
In Python it is possible to create virtual environments. Sounds crazy, but it is perhaps the most important thing a Python beginner should know beyond the actual syntax.
The advantage is that in this environment we can install all sorts of libraries that we need for our project (Power BI) without coming into conflicts with other Python projects on our computer that may need older or newer versions of the same libraries. The virtual environment creates (roughly speaking) a “virginal” Python environment for our projects.
It can therefore make perfect sense to install and set up your own virtual environment for Power BI:
//create a folder tmppy
D:\>cd tmppy //install a virtual environment
D:\tmppy>.\env\Scripts\activate //activate it (venv)
D:\tmppy>pip install pandas
D:\tmppy>pip install matplotlib
D:\tmppy>deactivate //deactivate it
In Power BI we can then tell the programm to use our virtual environment as follows..
Testing the Python implementation in Power BI …
Now it is time to test Python in Power BI.
One possible use case of Python is as a query. I wrote a detailed article about this (unfortunately currently only in German, but soon also in English), but we don’t need this knowledge for our test purposes.
Instead, we create a new query.
We choose Python as the query type and enter the following code in the query window.
Here again the code as text, in case you want to do copy ‘n paste.
import pandas as pd
data = pd.DataFrame([["Hallo", "World"],["Power", "BI"]], columns=[“Cool column 1”, “Mega cool Column 2”])
The result should look like this.
By the way, if something went wrong, we’ll get an error message as shown below.
Excursus: Problems with Python
In addition to simple syntax errors, missing imports or typing errors, strange error messages can occur.
An error that occurs frequently at the moment is described here on stackoverflow:
In my experience, it only helps to experiment with different Python and the package versions.
What to do next?
If setting up Python worked, now it’s time to use Python.
I write more articles about that, soon.
By the way, if the above code is completely alien to you, learn a little Python and the pandas dataframe.