Monitoring trains the sysadmin way
After discovering that the site viaggiatreno.it and lefreccie.it
kindly offers some API to their train data i decided to implement my own
monitoring system to get a complete overview of what is happening
in the public train system and never miss a train.
Master Plan:
- Scrape all data available (Train departure/arrival,delays,stations....)
- Standardize the format so i can implement pluggable systems (Grafana, Telegram Bot, Website, Twitter..)
- At least have fun when i hear "We are sorry for the inconvenience" while i check my systems
Scraping all the relevant datasets
All the data is collected with a script every 30 minutes using as
input the site APIs and station lists, the ouput will be saved into
InfluxDB (Legit, delay time tracking with timeseries DBs)
and a local folder for historical data that i will use later with git.
Standardize format
To allow multiple systems communicate together you always need to take raw data (train datasets) and standardize it into a more pluggable format:
- InfluxDB (Pros: A lot of client support, Grafana, Alerts, SQL... Cons: Some more resources usage)
- Git+Local (Pros: Efficent historical data tracking and easy full-text search... Cons: None)
Developing "Pluggable" systems:
- The Grafana Dashboard gathers all the relevant metrics that i cherry
picked from InfluxDB (Train departure/arrival, delay, last station
detected, train number, timestamp).
With this datasets i could easily create a dashboard that can really
give you all the information that you can see on station information
display systems.
- The telegram bot https://t.me/Trenordalerts_bot
is written in Go and with under 300 lines of code is it possible to
read all the collected data of delays and communicate them to the user.
The alerting part of the bot is more complex than the "Give me info of
xyz train" because i need to identify the user before sending an alert
(obvious.. i don't want to receive alerts of my friend's train) so i
implemented
a connector with a relational DB where i track chat_ids and alerts.
- Static Website https://trenistats.it
This is where the magic happens , the flow of the data is very simple now and i just need to gather data from one of my inputs (InfluxDB,Git or Local Dir) and show some graphs. How? The html code is automatically generated via a running script that collects the data from the local repo and generates the index.html for the static site. Even if i'm not a frontend specialist i managed to make something cool out of it (pure google fu skills and design 101) - Twitter bot https://twitter.com/trenistats
The bot gathers information from the local repo and triggers alerts via Twitter APIs if trains start/have long delays and tagging the main Italian company that's responsible for the transportation system. Like the Telegram bot this is written in Go with very few lines, basically take what i have already did for Telegram + Twitter API integration.
That's it, i could have gone far into dashboarding and alerting but this setup seems to work fine for me.
I tried integrating Elasticsearch + Kibana for more fun stuff but Influx
+ Grafana did the job very well (it just works.. and no json decoding
fights).
Comments
Post a Comment