Saturday 27 January 2018

StockReader: Respecting Robots.txt and Initial Scraping

Robots have some rules to follow (or to break, if the robots are nasty). Those rules are defined in the robots.txt file of the particular web site.

I'll start by creating a stub Python script that accesses a web page and prints it to the console.

First, I need to install a module: requests. To do that, I need a Python package manager, pip:
From https://bootstrap.pypa.io/, download and run get-pip.py.

To install a package:
  1. Navigate to the folder where pip.exe is located.
  2. Run pip install requests

Now requests can be imported and used.
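A minimal sketch of such a stub could look like the following. The function name fetch_page, the 10 second timeout and the optional get parameter are my own assumptions for illustration; the get parameter only exists so that the function can be tried without network access.

```python
def fetch_page(url, get=None, timeout=10):
    """Return the body of the page at url as text.

    get is a hypothetical hook: any callable with the same interface as
    requests.get. When it is omitted, requests.get is used.
    """
    if get is None:
        import requests  # third-party module, installed with pip
        get = requests.get
    response = get(url, timeout=timeout)
    # Raise an exception for 4xx/5xx answers instead of returning an error page.
    response.raise_for_status()
    return response.text
```

Calling print(fetch_page(url)) with the address of the page to read prints the HTML to the console.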

In my case, I'll use DI.se. First, I'll analyse the corresponding robots.txt file. If it allows scraping, I'll download the stock list and analyse the stocks in the list.
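The robots.txt check can be sketched with urllib.robotparser from the standard library. The rules in SAMPLE_ROBOTS, the user agent name "StockReader" and the example.com addresses are made up for illustration; they are not the rules of any real site.

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Hypothetical rules, only used to show the behaviour.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_ROBOTS, "StockReader", "https://example.com/stocks/"))
print(is_allowed(SAMPLE_ROBOTS, "StockReader", "https://example.com/private/x"))
```

Against a live site, RobotFileParser can also download the file itself with set_url() followed by read().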

Saturday 20 January 2018

TrafficControl: Locking Tracks for Trains - Tests and Visualization - QML

The visualization of locks in QML will be dynamic, with a green/red dot at the start/end station if the track is locked:

  • FREE: If the track is free, no dots will be shown. The track will be green.
  • END/START: If the track is locked in both directions, red dots will be shown at both the end and the start station. The track will be red.
  • LOCKED: If the track is locked at the start or end station side, a red dot will appear near that station and a green dot will appear at the other station. The track will be yellow.
When the lock status changes, a signal is sent to QML with the new state of the track.
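The colour rules above can be summarized in a small Python sketch. It is only a model of the rules, not the QML code; the function name track_visual and the two boolean parameters are assumptions that I use to describe which side of the track is locked.

```python
def track_visual(locked_at_start, locked_at_end):
    """Return the track colour and the dot colours for a lock situation.

    A dot value of None means that no dot is shown.
    """
    if not locked_at_start and not locked_at_end:
        # Free track: green track, no dots.
        return {"track": "green", "start_dot": None, "end_dot": None}
    if locked_at_start and locked_at_end:
        # Locked in both directions: red track, red dots at both stations.
        return {"track": "red", "start_dot": "red", "end_dot": "red"}
    # Locked at one side only: yellow track, red dot at the locked side.
    return {
        "track": "yellow",
        "start_dot": "red" if locked_at_start else "green",
        "end_dot": "red" if locked_at_end else "green",
    }
```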



If the state goes from FREE to any other state, two dots are created close to the start and end station. To avoid overlapping dots, the QML part will search the QML tree for existing dots and add new signal dots only if there are no existing dots for that particular track.

If the last train leaves a track, that track will be set to FREE and the corresponding signal dots will be removed in QML.

When QML items are removed, it takes some time for the removal to take effect. That caused some strange behaviour when a train arrived at a station where another train was waiting for the same track:
  1. Train 1 arrives at station 1 from track 1, and the existing dots are removed from track 1.
  2. Train 2 enters track 1. QML looks for signal dots for track 1 and finds them, as the removal is still in progress. The existing signal dots are repainted instead of new ones being created.
  3. The signal dots for track 1 are finally removed, so the track is locked but has no dots.
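The sequence can be reproduced in a small Python model. This is not the QML code; the names Dot, DotScene and simulate are hypothetical, one Dot stands for the pair of signal dots of a track, and skipping items that are already marked for removal is only one possible workaround that I assume here.

```python
from dataclasses import dataclass

@dataclass
class Dot:
    track_id: int
    dying: bool = False  # removal requested but not yet carried out

class DotScene:
    def __init__(self):
        self.dots = []

    def request_removal(self, track_id):
        # Models the delayed removal: the item is only marked here.
        for dot in self.dots:
            if dot.track_id == track_id:
                dot.dying = True

    def flush_removals(self):
        # Models the moment when the removal finally takes effect.
        self.dots = [dot for dot in self.dots if not dot.dying]

    def ensure_dots(self, track_id, skip_dying):
        # Create dots only if the search finds none for this track.
        found = any(
            dot.track_id == track_id and not (skip_dying and dot.dying)
            for dot in self.dots
        )
        if not found:
            self.dots.append(Dot(track_id))

def simulate(skip_dying):
    """Return the number of dots left on track 1 after the sequence."""
    scene = DotScene()
    scene.ensure_dots(1, skip_dying)  # track 1 becomes locked
    scene.request_removal(1)          # train 1 leaves track 1
    scene.ensure_dots(1, skip_dying)  # train 2 enters before the removal is done
    scene.flush_removals()            # the removal finally happens
    return len(scene.dots)

print(simulate(skip_dying=False))  # the search finds the dying dots
print(simulate(skip_dying=True))   # the search ignores the dying dots
```

With skip_dying=False the model ends with no dots although train 2 is on the track; with skip_dying=True a fresh dot survives the delayed removal.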

Saturday 13 January 2018

StockReader: Re-implement in Python

One of my first larger software projects was StockReader. That is a C++ web scraper for stocks.

The final result of StockReader
Now, I need to re-implement the program for four reasons: 
  1. My firewall blocks executables that send several GET requests
  2. The source code is horrible and needs to be refactored completely in order to be more maintainable
  3. I should be able to use regular expressions.
  4. The program should be more automatic, selecting stocks automatically
I will not publish the program on GitHub, but I'll post about some of the issues I run into and discuss them.

The first step will be to implement a script that gets a web page from a URL. From that HTML code, I'll extract the URLs for the key number sections for the stocks.

When creating code, it is crucial to understand what one is writing. It is important to resist the temptation to copy some code from Stack Overflow and instead focus on really understanding it.

In the Python documentation, there is a section about regular expressions.
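As a first experiment with the re module, the link extraction could be sketched as below. The HTML snippet, the paths in it and the names HREF_PATTERN and extract_links are made up for illustration; the real pages will need their own pattern. A regular expression like this one only handles simple, double-quoted href attributes.

```python
import re

# Capture the value of a double-quoted href attribute inside an <a ...> tag.
HREF_PATTERN = re.compile(r'<a\s[^>]*href="([^"]+)"', re.IGNORECASE)

def extract_links(html, must_contain=""):
    """Return all link targets in html that contain the text must_contain."""
    return [url for url in HREF_PATTERN.findall(html) if must_contain in url]

# Hypothetical page content.
SAMPLE_HTML = """
<a href="/stocks/abc/keyfigures">ABC</a>
<a class="x" href="/stocks/def/keyfigures">DEF</a>
<a href="/about">About</a>
"""

print(extract_links(SAMPLE_HTML, "keyfigures"))
```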

Saturday 6 January 2018

TrafficControl: Locking Tracks for Trains - Tests and Visualization

After adding locks for tracks, I need to both visualize them and test them.

The tests are quite straightforward, but they require some code.
  • The first test case, trainListAddRmUpstream, verifies that it is possible to add two trains going upstream on the same track, with a distance of more than 3000 m between the trains. It also verifies that the first train is removed first.
  • The second test case, trainListAddDownstream, verifies that a train that wants to enter a used track, where a train is moving in the opposite direction, will wait until the first train has left the track.
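The rules that the two test cases check can be written down as a small Python model. This is not the TrafficControl code; the class TrackModel, its method names and the direction strings are hypothetical, and only the 3000 m limit and the first-in, first-out order come from the tests above.

```python
from collections import deque

MIN_GAP_M = 3000  # required distance between trains in the same direction

class TrackModel:
    def __init__(self):
        self.trains = deque()  # (name, direction), oldest train first

    def can_enter(self, direction, gap_to_last_m=None):
        if not self.trains:
            return True  # a free track can be entered from either side
        if direction != self.trains[-1][1]:
            return False  # a train is moving in the opposite direction: wait
        # Same direction: the gap to the previous train must exceed 3000 m.
        return gap_to_last_m is not None and gap_to_last_m > MIN_GAP_M

    def enter(self, name, direction, gap_to_last_m=None):
        if not self.can_enter(direction, gap_to_last_m):
            return False
        self.trains.append((name, direction))
        return True

    def leave(self):
        # The train that entered first is removed first.
        return self.trains.popleft()[0]
```

For example, two upstream trains with a 3500 m gap are both accepted, while a downstream train is refused until both have left.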
For the visualization, I'll change both model/view and the QML part.

Model/View:
A new column will be added for each track: Lock. The different states will be: Free, Start, End and Locked.

When something changes for a track, the method Track::emitChangedSignal(trackID) is called. That method compiles a QStringList that contains the messages that shall be sent to the data model. The lock status is added as a string to the string list that is sent to the track data model. A matching column is also added to the data model.

Three tracks are locked. For example, dHja_Lun_E is locked at the end station (LundC). This means that trains can enter at Hjarup but not at LundC.
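The meaning of the four lock states can be sketched in Python as follows. This is a model, not the C++ code: the names LockState, can_enter_from and track_row are hypothetical, the row only contains the track name and the lock column, and I assume here that Locked means that the track cannot be entered from either side.

```python
from enum import Enum

class LockState(Enum):
    FREE = "Free"
    START = "Start"
    END = "End"
    LOCKED = "Locked"

def can_enter_from(lock, side):
    """Return True if a train may enter the track at side ('start' or 'end')."""
    if lock is LockState.FREE:
        return True
    if lock is LockState.LOCKED:
        return False  # assumption: locked at both stations
    blocked_side = "start" if lock is LockState.START else "end"
    return side != blocked_side

def track_row(name, lock):
    """Compile the strings for one row; the lock status is sent as a string."""
    return [name, lock.value]

# dHja_Lun_E is locked at the end station: entry at the start side only.
print(can_enter_from(LockState.END, "start"), can_enter_from(LockState.END, "end"))
print(track_row("dHja_Lun_E", LockState.END))
```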
The next blog post will describe how I added the locks in the map.