Saturday, 31 August 2019

StockAnalyzer: Adding XML Reader to the Program

I've created a separate page for StockReader and StockAnalyzer, with a short summary of the programs. In this blog post, I'll use XML and C# dictionaries to verify that the stock names are correct for the stock records.

When reading the stock records from the CSV files, I need to verify that the stock name I queried (the first string) matches the stock record that I actually got (the second string).

As I mentioned in my last post, I'll have three lists of valid, invalid and unknown combinations in an XML file. The program will read that file and populate the corresponding dictionaries.
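
A minimal sketch of how such a file could be read into dictionaries with XmlReader. The XML layout is an assumption, not the actual file format: the element names white, gray, black and pair, the attributes wanted and fetched, and the class name NameListLoader are all hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NameListLoader
{
    // Assumed (hypothetical) layout:
    // <lists>
    //   <white><pair wanted="..." fetched="..." /></white>
    //   <gray>...</gray>
    //   <black>...</black>
    // </lists>
    // Returns one dictionary per list, keyed by fetched name,
    // with the wanted names as values.
    public static Dictionary<string, Dictionary<string, List<string>>> Load(TextReader input)
    {
        var lists = new Dictionary<string, Dictionary<string, List<string>>>
        {
            ["white"] = new Dictionary<string, List<string>>(),
            ["gray"] = new Dictionary<string, List<string>>(),
            ["black"] = new Dictionary<string, List<string>>()
        };

        string current = null; // name of the list element we are inside

        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element) continue;

                if (lists.ContainsKey(reader.Name))
                {
                    current = reader.Name;
                }
                else if (reader.Name == "pair" && current != null)
                {
                    string wanted = reader.GetAttribute("wanted");
                    string fetched = reader.GetAttribute("fetched");
                    if (wanted == null || fetched == null) continue; // skip incomplete tags

                    if (!lists[current].TryGetValue(fetched, out List<string> wantedNames))
                    {
                        wantedNames = new List<string>();
                        lists[current][fetched] = wantedNames;
                    }
                    wantedNames.Add(wanted);
                }
            }
        }
        return lists;
    }
}
```

Calling NameListLoader.Load(new StringReader(xmlText)) or passing a StreamReader for the file would return the three dictionaries under the keys "white", "gray" and "black".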

When the program reads a stock record, it will look for the stock names in the dictionaries.
  • If it is found in the white list, that stock record can be added to the database.
  • If it is found in the black list, that stock record is ignored.
  • Otherwise, the combination is unknown and the program will print a recommended XML tag. The user can decide whether to add that tag to the white or the black list.
The user will edit the XML file and move combinations from the gray list to the black or white list manually.
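
The lookup steps above could be sketched roughly as follows. This is only an illustration under assumptions: the names NameVerdict, NameCheckSketch, Classify and SuggestTag are hypothetical, each list is assumed to be a dictionary keyed by the fetched name, and the printed tag format (pair, wanted, fetched) is made up for the example.

```csharp
using System.Collections.Generic;
using System.Security;

public enum NameVerdict { Accept, Ignore, Unknown }

public static class NameCheckSketch
{
    // True if the list maps the fetched name to the wanted name.
    private static bool Contains(
        Dictionary<string, List<string>> list, string wanted, string fetched)
    {
        return list.TryGetValue(fetched, out List<string> wantedNames)
            && wantedNames.Contains(wanted);
    }

    public static NameVerdict Classify(
        string wanted,
        string fetched,
        Dictionary<string, List<string>> whiteList,
        Dictionary<string, List<string>> blackList)
    {
        if (Contains(whiteList, wanted, fetched)) return NameVerdict.Accept; // add to database
        if (Contains(blackList, wanted, fetched)) return NameVerdict.Ignore; // skip the record
        return NameVerdict.Unknown; // print a recommended tag for the user
    }

    // Builds a tag (hypothetical format) the user can paste into the XML file.
    public static string SuggestTag(string wanted, string fetched)
    {
        return "<pair wanted=\"" + SecurityElement.Escape(wanted)
            + "\" fetched=\"" + SecurityElement.Escape(fetched) + "\" />";
    }
}
```

For an unknown combination, the program would print the result of SuggestTag and leave the record out of the database until the user has moved the tag to one of the two lists.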

XmlReader vs XmlDocument in C#
XmlReader is a forward-only, read-only reader that streams through the XML, whereas XmlDocument loads the whole document into memory as an editable tree. The reader requires more code but consumes less memory. I've decided to use the XML


New Class: NameChecker
Name checking will be handled in a separate class, which is called from fileParser. I'll implement it in the next post.

Mapping the Wanted Stock Name to the Fetched Stock Name
In C#, a dictionary looks up a value from its key efficiently, but there is no built-in mapping from a value back to its key. Finding the key for a given value means scanning all entries, which is time consuming.

I'll therefore use the fetched name as the dictionary key. Looking it up gives a list of the wanted names that it may correspond to (in most cases a single name).
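
To illustrate the difference between the two lookup directions, here is a small sketch. It assumes a Dictionary<string, List<string>> keyed by fetched name; the class and method names (FetchedNameIndex, Add, WantedNamesFor, FetchedNamesFor) are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class FetchedNameIndex
{
    // Registers that the fetched name may stand for the wanted name.
    public static void Add(
        Dictionary<string, List<string>> index, string fetched, string wanted)
    {
        if (!index.TryGetValue(fetched, out List<string> wantedNames))
        {
            wantedNames = new List<string>();
            index[fetched] = wantedNames;
        }
        if (!wantedNames.Contains(wanted)) wantedNames.Add(wanted);
    }

    // Key to value: a single hash lookup.
    public static IReadOnlyList<string> WantedNamesFor(
        Dictionary<string, List<string>> index, string fetched)
    {
        return index.TryGetValue(fetched, out List<string> wantedNames)
            ? wantedNames
            : new List<string>();
    }

    // Value to key: has to scan every entry in the dictionary.
    public static List<string> FetchedNamesFor(
        Dictionary<string, List<string>> index, string wanted)
    {
        return index
            .Where(entry => entry.Value.Contains(wanted))
            .Select(entry => entry.Key)
            .ToList();
    }
}
```

With the fetched name as key, the common check ("which wanted names does this fetched name belong to?") uses the fast direction, and the slow scan is only needed for occasional reverse queries.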

Saturday, 17 August 2019

StockAnalyzer: Handling Stock Name Mismatches

One flaw in the earlier versions of my web scraper was that it used hard-coded links when retrieving the stock data. The links were based on hard-coded numbers, and I needed a safeguard to detect whether those numbers had changed.

The program took the first name ("Eniro") as the stock name. It used the corresponding numbers (869, 2398 and 46100341) to build three URL strings for fetching the stock data.

For that reason, each record starts with two stock names: the first is the stock that I wanted to fetch and the second is the stock that was actually collected. Ideally the two names match, but there are several mismatches.

I need to verify the stock records before adding them to the database. One check is whether the first string is "---Void---". That string was a placeholder, and such a record should be discarded.
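
The placeholder check could look something like the sketch below. The field separator (';'), the treatment of blank lines and the names RecordFilter and ShouldDiscard are assumptions made for the example; only the "---Void---" string comes from the actual data.

```csharp
public static class RecordFilter
{
    public const string VoidPlaceholder = "---Void---";

    // Returns true if the record should be dropped before name checking.
    // Assumes the wanted stock name is the first field of the line.
    public static bool ShouldDiscard(string csvLine, char separator = ';')
    {
        if (string.IsNullOrWhiteSpace(csvLine)) return true; // assumption: skip blank lines too

        string[] fields = csvLine.Split(separator);
        return fields[0].Trim() == VoidPlaceholder;
    }
}
```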


Further, I need to check whether the stock data I have really matches the stock name. To do this, I'll use an XML file that contains three lists:
  • White List for the allowed combinations of names,
  • Grey List for the combinations of names that haven't been verified by the user yet and
  • Black List for the illegal combinations of names (these records will be ignored by the program).


A lesson learnt from this is to make sure that the web-scraped data is in good shape already at the time of scraping. The next blog post will cover some XML topics for C#.

Saturday, 10 August 2019

Garage Cleanup

My summer project at the summer cottage was to fix the garage. You can see the before images below.




The black marks on the wall are probably from car tires. There are also a couple of holes in the wall.

After throwing away some rubbish and emptying the garage, I was able to paint the surfaces. I got help with painting the kitchen shelves (they are from a French kitchen of the late 1970s).

I've mounted four connected IKEA Hejne shelves on the far wall. To avoid stains from the firewood, I put a tarp behind the wood.


I also mounted two connected 50 cm IKEA Hejne shelves in the corner.

This took some time from my ordinary pet projects. I'll resume them now.