Publish An Interactive Data Visualization Dashboard On A Website – All The Solutions I Tried
As you can see, this website is built with WordPress and was originally hosted on a GoDaddy virtual host. The contract covered three years, but I moved the site to SiteGround about halfway through. Luckily, I also have some dashboard side projects under way, which can keep the SiteGround server productive by running automated programs or simply storing data.
I’m definitely not an expert in server or host management, but I assume that loading and processing data takes a lot of resources. Keeping the computing server separate from the web hosting server should therefore do no harm, since it shields the website from collateral damage if the data overflows on the computing side.
Another strength of this strategy concerns backups. The data that feeds the dashboard is mostly free and publicly accessible, which makes backing it up more or less unnecessary: the program code can be backed up from your computer to GitHub, and the missing data can simply be downloaded again when things go south. My website, in contrast, I back up daily. After all, one can never be too careful with one’s own business.
Purpose of This Article
Hosting a dashboard on your own computer, the so-called local end, is not really difficult. Running the same project on a website is a lot more complicated. The text below covers the setbacks I ran into in my earlier attempts, the workarounds I figured out, and the solution I finally settled on.
My dashboard project had made some initial progress when I wrote this post, but it had yet to produce actual results. I will update this post with further progress and new solutions. Comments and feedback are welcome.
By the way, although the term “local” may sound a bit geeky, I will still use it in this article to distinguish our own computers from the computers that host websites.
To begin a project, you need a topic you have your own perspective on. For my first attempt, I wanted to design an interactive dashboard of closing prices in the produce market. The data is available in JSON format on the open data platform, which makes historical data easy to acquire.
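To give a rough idea of what working with such data looks like, here is a minimal sketch that parses a JSON payload and groups closing prices by crop. The sample records and the field names (date, crop, close_price) are made up for illustration; the real fields on the open data platform will likely differ.

```python
import json
from collections import defaultdict

# Hypothetical sample; the real field names on the open data platform may differ.
SAMPLE_JSON = """
[
  {"date": "2020-03-03", "crop": "cabbage", "close_price": 19.0},
  {"date": "2020-03-02", "crop": "cabbage", "close_price": 18.5},
  {"date": "2020-03-02", "crop": "banana", "close_price": 32.1}
]
"""


def closing_prices_by_crop(raw_json):
    """Return {crop: [(date, price), ...]} with each series sorted by date."""
    records = json.loads(raw_json)
    series = defaultdict(list)
    for row in records:
        series[row["crop"]].append((row["date"], float(row["close_price"])))
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    return {crop: sorted(points) for crop, points in series.items()}


prices = closing_prices_by_crop(SAMPLE_JSON)
print(prices["cabbage"])
```

Each per-crop series is then ready to hand to whatever plotting tool ends up drawing the dashboard.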
That attempt went smoothly, but I doubt the second or third one will. Some data will definitely have to be collected with crawlers. As the saying goes, one should always be prepared for the worst. Only then can one establish an SOP flexible enough to apply to every project.
Python or R?
The most renowned programming languages in data analytics are Python and R. Both can analyze and visualize data, and tutorials, forums, and modules for both are equally accessible online. It’s difficult to judge which one is superior. Learning cost, whether in time or money, was not a consideration for me, because I already had basic knowledge of both. But when it comes to versatility, Python is undoubtedly my choice.
Then, what do I mean by versatility?
First, as I’ve mentioned, there are occasions when we need crawler code to acquire raw data, and that is a strength of Python. There are crawler libraries for R, but the feedback from their users is not very... positive, if you know what I mean. After all, it is only reasonable to build web crawlers with Python, isn’t it?
Second, when it comes to actual applications, R is mostly used for statistical analysis. I’ve heard of people using R to write programs such as “snake” or “minesweeper.” However, that seems like a waste of talent in my opinion. Python, by contrast, is widely used in various domains. In addition to data analysis, some programmers also build websites or apps with it.
Taking these two points into consideration, Python clearly has the upper hand. That’s why I chose Python as my programming language without hesitation, and the rest of this article focuses on issues that come up when working with Python.
Where to Display Your Work
This is not actually a problem for me, given that I’ve made up my mind to post the project on my own website. Still, spelling it out makes the reasoning clearer.
If you’re going to post your project on your own website, you need to deal with chores like website management and server maintenance first. If you don’t have your own website, free or paid hosting platforms are worth considering. R users can upload their content to RPubs, just like the COVID-19 real-time update made by my colleague. I haven’t actually searched for the Python counterpart, but I believe similar resources are available.
If you post your dashboards on a hosted platform such as RPubs, traffic is not your concern. But for those who host on their own server, traffic management must not be taken lightly. The raw data can be large; by that, I mean dozens to a few hundred MB. Every time a visitor opens your page, your server has to send out that same amount of data, which adds up horribly. Imagine your project somehow goes viral and countless visitors are scrolling through your website. Your server gets jammed with the traffic, and your hosting provider sends you some “regards”, asking whether you are going to upgrade your plan. If you ignore these KIND regards, your website may simply be shut down. And if you argue that your provider promised unlimited traffic, I’d say they were just being polite.
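A back-of-the-envelope estimate shows how quickly this adds up. The sketch below assumes that every visit transfers the full dataset and that 1 GB equals 1000 MB; the file size and visit counts are hypothetical.

```python
def monthly_traffic_gb(payload_mb, requests_per_month):
    """Outbound traffic in GB if every request transfers the full payload.

    Assumes 1 GB = 1000 MB and ignores page assets, caching, and compression.
    """
    return payload_mb * requests_per_month / 1000


# Hypothetical numbers: a 100 MB dataset at three levels of popularity.
for visits in (30, 1_000, 50_000):
    gb = monthly_traffic_gb(100, visits)
    print(f"{visits:>6} visits -> {gb:,.0f} GB per month")
```

Even a modest audience multiplies the dataset size into hundreds of gigabytes, which is why it matters so much where the raw data is served from.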
In general, traffic management is the most problematic part of designing the workflow, and it ultimately limits the options for data visualization.
Visualization tools for Python, such as Plotly, are readily available. Yet after taking traffic and system stability into consideration, Power BI seemed to be the most reasonable option. But things went sideways.
Power BI is a visualization tool released by Microsoft, so I’ll skip the introduction here. In my initial plan, I would upload Python files to the server and have them update and save the data automatically every day. Power BI would then fetch the updated data from the server and store it in Microsoft’s database. With only one update per day, even a 100 MB file would take up just 3 GB of traffic a month (100 MB × 30 days). Not even a problem, is it? After that, I would only have to build an interactive dashboard in Power BI and use the “Publish to web (public)” function to embed the iframe output on the website.
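The server-side half of this plan could look roughly like the sketch below: a script that fetches the latest records and overwrites one fixed file for the BI tool to pull. The function name run_daily_update, the file name latest.json, and the stand-in fake_fetch are all assumptions for illustration, and the daily schedule itself (for example a cron job) is left to the host.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def run_daily_update(fetch, out_dir, today=None):
    """Fetch the latest records and overwrite one fixed file in out_dir."""
    today = today or date.today()
    payload = {"updated": today.isoformat(), "records": fetch()}
    out_path = Path(out_dir) / "latest.json"
    tmp_path = out_path.with_name(out_path.name + ".tmp")
    # Write to a temporary file first, then swap it in, so a reader
    # never picks up a half-written file.
    tmp_path.write_text(json.dumps(payload), encoding="utf-8")
    tmp_path.replace(out_path)
    return out_path


def fake_fetch():
    # Stand-in for the real download from the data source.
    return [{"crop": "cabbage", "close_price": 18.5}]


with tempfile.TemporaryDirectory() as tmp:
    path = run_daily_update(fake_fetch, tmp, today=date(2020, 3, 2))
    saved_name = path.name
    saved = json.loads(path.read_text(encoding="utf-8"))

print(saved_name, saved["updated"])
```

Because the file name never changes, whatever pulls the data only needs to know a single address, and each day’s run simply replaces the previous snapshot.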
This may sound elegant. But the thing is, since the early 2020 update, the embed code can only be obtained with permission from a Power BI admin.
Some online tutorials say one can fix this by clicking the cog icon in the top-right corner of the screen, going through Settings > Admin portal, and finally clicking a mysterious button. But this turns out to work for Premium accounts only. In other words, ordinary or Pro account users can no longer embed their work publicly on any website. The only good news is that previously generated embed codes remain unaffected.
The community initially took this for a bug introduced by the update, but after going through the comments of this article, it seems that Microsoft did it on purpose. Perhaps this is their countermeasure against a traffic crisis. For now, I can only hope that Microsoft will give Pro users access to embed codes. 10 USD per month for that is a pretty fair deal, but it’s not my decision to make. In summary, Power BI is not an option at this time.
Tableau is another well-known BI (business intelligence) tool that is often compared with Microsoft’s Power BI. Usability aside, Tableau’s pricing is just too unfriendly for individual users, so unfortunately it is not an option for me.
Having found that the two popular candidates were not suitable, I kept searching for smaller-scale alternatives.
Grafana, developed by Grafana Labs, is an open-source product that charges enterprise users while staying free for individual users, though with certain limitations. Users can host their dashboards either on their own servers or on the servers of Grafana Labs. However, the latter option allows at most 5 dashboards.
When working with unfamiliar brands or products, I take the extra precaution of checking their website traffic on SimilarWeb, in case they collapse in the blink of an eye. With 700k visits per month, Grafana Labs passes that test.
To summarize, Grafana Labs just earned its seat on my list.
Freeboard is somewhat similar to Grafana Labs, but it has no free tier. Plans start from 12 USD per month, which is not really expensive. Note that freeboard does not say whether files can be uploaded to its servers, but with a 30-day trial, one can simply try it out if needed.
As for its staying power, SimilarWeb shows no traffic data for freeboard’s website, which suggests that it gets very little traffic. In my experience, SimilarWeb displays data once a website reaches 50k visits per month. Freeboard may not be that popular yet, but it seems able to serve as a backup.
Installation of Python and Its Modules on the Virtual Host
After picking Grafana Labs as my visualization tool, I started installing Python on the GoDaddy server. An error popped up during the application installation.
Could not open requirements file: [Errno 2] No such file or directory: 'python/pip'
You are using pip version 9.0.1, however version 20.0.2 is available. You should consider upgrading via the 'pip install --upgrade pip' command.
I upgraded pip accordingly, but it failed again. It may have been the wrong command or some other technical issue beyond my understanding. It also reveals another potential hazard. The virtual host provided by GoDaddy is a Linux one on which I can access only cPanel, not root. Even if I managed to overcome the present issue, many more might be waiting for me in the near future. And as far as I understand, I can only run Python files in full after uploading them to the host, unlike in an IDE or an IPython Notebook, where I can run parts of them selectively. That is extremely inconvenient for testing and debugging (perhaps I’ve been going about it the wrong way). All in all, I was forced to look for another way all over again.
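For what it’s worth, the first message reads as if pip was handed python/pip as the path of a requirements file, though that is only a guess. A common way to rule out a mix-up between several pip installations is to call pip through the interpreter itself. The sketch below only builds such commands; the helper name pip_install_command is made up, and whether the --user flag helps depends on the host (it is meant for accounts without root and does not work inside a virtual environment).

```python
import sys


def pip_install_command(*packages, upgrade=False, user_site=False):
    """Build a 'python -m pip install' command tied to the running interpreter."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        cmd.append("--upgrade")
    if user_site:
        # Installs into the user's own site-packages; no root needed,
        # but not usable inside a virtual environment.
        cmd.append("--user")
    cmd.extend(packages)
    return cmd


upgrade_pip = pip_install_command("pip", upgrade=True)
print(" ".join(upgrade_pip))
# To actually run it: import subprocess; subprocess.run(upgrade_pip, check=True)
```

Building the command as a list keeps the sketch harmless until you decide to run it, and it makes explicit which Python the packages would end up in.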
PythonAnywhere – A Specific Resolution for Python
After weighing the risk of continuing on a virtual host, I considered moving to a dedicated host, GCP, or Azure. But besides the higher cost, dealing with complicated technical issues is just too troublesome for me. I also figured that, with so many people doing data analytics, there should be a solution made specifically for our needs. Finally, I came across PythonAnywhere, a company that offers hosting services specifically for Python. Most hosting providers claim that their servers are perfectly compatible with Python. But in my opinion, Python compatibility is not their primary concern, so the related technical support may not be that thorough. It is the same reason I moved my website to SiteGround instead of staying with GoDaddy: SiteGround clearly focuses on compatibility with WordPress, and its customer service is timely and satisfying.
Plans and Their Costs
PythonAnywhere is not expensive either. Plans start as low as 5 USD per month, and they can be customized in several ways to fit the user’s specific needs (exactly what I need!). I have paid for a membership and successfully deployed a Plotly interactive dashboard onto the host. You can find the result in the tutorial video Deploy Dashboard With Python And Plotly/Dash On PythonAnywhere and in my portfolio.
Extensibility and Stability
In June 2022, Anaconda announced its acquisition of PythonAnywhere, which makes me a lot more confident about the future development and stability of its services. For more details, see the Business Wire release Anaconda Acquires PythonAnywhere to Expand Python Team Collaboration in the Cloud.