# meds-processor: a guide to C# and .NET Core

This is a great place to start learning C# and .NET Core by building something real and tangible: a drug list data scraper and a secure, documented REST web API. The project is designed for developers who have moderate programming experience and some experience building web apps, but have not yet encountered C# and .NET on the backend.
**Changelog:** … `ValidFrom.Year < 2019`; removed parallelization of the parser due to file locking issues; cleaned up the API responses a bit; removed unnecessary code.

**Known issue:** … the `ValidFrom.Year < 2019` expression. I will be fixing this with an additional parser (or another fix) and updating the blog posts!

The cross-platform, production-ready SDK is .NET Core, and the version used to build this application is `2.2.402`. You can find the SDK downloads for your OS here.
Build the application (and ensure internet connectivity so the NuGet packages can restore) with:

```shell
cd src/MedsProcessor.WebAPI
dotnet build
```

Run the application (and ensure internet connectivity so the web scraper can work) on https://localhost:5001 with:

```shell
cd src/MedsProcessor.WebAPI
dotnet run
```

You can now browse the Web API via the Swagger UI at: https://localhost:5001/swagger/index.html
The image below is a screenshot of the Swagger UI, which documents the Web API's available endpoints and their respective HTTP methods.
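Swashbuckle builds this UI from the controllers' attributes and XML doc comments. A hedged sketch of what a documented action can look like — the controller name, route and payload below are illustrative, not the repository's actual code:

```csharp
using Microsoft.AspNetCore.Mvc;

namespace MedsProcessor.WebAPI.Controllers
{
	/// <summary>
	/// Serves the parsed drug list data.
	/// (Illustrative sketch, not the project's real controller.)
	/// </summary>
	[ApiController]
	[Route("api/[controller]")]
	public class DrugsController : ControllerBase
	{
		/// <summary>Returns all parsed drug list entries.</summary>
		/// <response code="200">The list was returned successfully.</response>
		[HttpGet]
		[ProducesResponseType(200)]
		public ActionResult<string[]> GetAll() =>
			Ok(new[] { "example-drug" }); // placeholder payload
	}
}
```

For the `<summary>` comments to reach Swagger, the project also needs `<GenerateDocumentationFile>` enabled and the XML file registered via `IncludeXmlComments` in `AddSwaggerGen`.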
I was irritated by the fact that my country's health insurance fund releases important data, such as its lists of medicines and drugs, in such an unstructured and user-unfriendly format. I also figured I was a bit rusty with .NET Core and with writing technical blog posts.
Learn to build a web scraper, downloader & Excel parser by digging through some hideous spreadsheet data from Croatia's Health Insurance Fund, namely its primary and supplementary lists of drugs, using only C# and .NET Core (on any modern computer OS)! The .NET Core SDK can be installed and used the same way on Windows, macOS or Linux.
The repository is composed of four parts; each part is a git branch with its own blog post article. You can browse the branches here on GitHub (via the branch selection dropdown). I advise starting with the part/1 blog post, as it will guide you through building the solution on your own. You can use any modern OS and code editor.
- **part/1** (Practical .NET Core — write a web scraper, downloader & Excel parser. Part 1: Scraper): use the AngleSharp library to fetch remote HTML pages and extract links.
- **part/2** (Practical .NET Core — write a scraper, fetcher & xls(x) parser. Part 2: Downloader): use the Task Parallel Library and process async Tasks in .NET Core.
- **part/3** (Practical .NET Core — write a scraper, fetcher & .xls(x) parser. Part 3: Parser): use the NPOI spreadsheet parsing library to extract relevant data into your C# model classes. Upon finishing, you will have a single dataset of transformed and organized data.
- **part/4** (Practical .NET Core — write a scraper, fetcher & .xls(x) parser. Part 4: Secure REST web API): generate Swagger docs based upon your well-documented controllers and actions.

The source has changed a lot through the parts, and there might be bugs, as this project is not covered by tests (something I might consider in the future). Besides the Web API implementation, part/4 refactors and improves some pieces that were deliberately left in less-than-ideal shape in the previous parts. The first thing you will notice is that this README.md is not in its final form on the first three branches. Don't get discouraged; rather, notify me if you see room for improvement. Everything should work as expected if you follow the blog series. Also, not every practice you see here, such as a base class for an HTTP response that carries HTTP header data, is the best thing to use in production. So yeah, always stay curious, ask yourself "Why?", rethink your approach and then execute.
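To give a taste of part/1, here is a minimal sketch of the fetch-and-extract-links idea using AngleSharp. The URL and the `.xls(x)` filter are placeholders, not the real HZZO addresses from the project:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using AngleSharp;

class ScraperSketch
{
	static async Task Main()
	{
		// Load a remote page through AngleSharp's default document loader.
		var context = BrowsingContext.New(Configuration.Default.WithDefaultLoader());
		var document = await context.OpenAsync("https://example.com/"); // placeholder URL

		// Extract the href of every anchor that points to a spreadsheet file.
		var links = document
			.QuerySelectorAll("a[href]")
			.Select(a => a.GetAttribute("href"))
			.Where(href => href.EndsWith(".xls") || href.EndsWith(".xlsx"));

		foreach (var link in links)
			Console.WriteLine(link);
	}
}
```

The same `QuerySelectorAll` call accepts any CSS selector, so narrowing the scrape to a specific table or div is a one-line change.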
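Part/2's downloader fans the fetches out with the Task Parallel Library; the core pattern is `Task.WhenAll` over one `HttpClient` request per document. A self-contained sketch with placeholder URLs (not the project's actual downloader):

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

class DownloaderSketch
{
	// One shared HttpClient instance, as recommended, to avoid socket exhaustion.
	static readonly HttpClient Http = new HttpClient();

	static async Task<byte[][]> DownloadAllAsync(string[] urls)
	{
		// Start every download eagerly, then await the whole batch at once.
		var downloads = urls.Select(url => Http.GetByteArrayAsync(url));
		return await Task.WhenAll(downloads);
	}

	static async Task Main()
	{
		var files = await DownloadAllAsync(new[]
		{
			"https://example.com/a.xlsx", // placeholder URLs
			"https://example.com/b.xlsx"
		});

		Console.WriteLine($"Downloaded {files.Length} files.");
	}
}
```

Note that `Select` here only *starts* the tasks; `Task.WhenAll` is what waits for all of them and surfaces any download failure as an exception.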
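Part/3 turns the downloaded spreadsheets into C# model objects with NPOI. A hedged sketch of reading rows from an `.xlsx` sheet — the `Drug` model, file name and column indexes below are assumptions for illustration, not the repository's real schema:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using NPOI.XSSF.UserModel; // .xlsx; use NPOI.HSSF.UserModel for legacy .xls files

class Drug
{
	public string Code { get; set; }
	public string Name { get; set; }
}

class ParserSketch
{
	public static List<Drug> Parse(Stream xlsx)
	{
		var drugs = new List<Drug>();
		var workbook = new XSSFWorkbook(xlsx);
		var sheet = workbook.GetSheetAt(0);

		// Start at 1 to skip the header row; columns 0 and 1 are assumed positions.
		for (var i = 1; i <= sheet.LastRowNum; i++)
		{
			var row = sheet.GetRow(i);
			if (row == null) continue; // NPOI returns null for blank rows

			drugs.Add(new Drug
			{
				Code = row.GetCell(0)?.ToString(),
				Name = row.GetCell(1)?.ToString()
			});
		}

		return drugs;
	}

	static void Main()
	{
		using (var file = File.OpenRead("drug-list.xlsx")) // placeholder file name
			Console.WriteLine($"Parsed {Parse(file).Count} rows.");
	}
}
```

Real-world sheets from HZZO are messier than this (merged cells, shifting headers), which is exactly what the part/3 post digs into.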
I am open to improvements, comments, issues, forks/PRs and anything else of good concern or idea. If you prefer, we can also discuss your ideas and topics in the comments section of the blog posts.
Vedran Mandić.
MIT License