It's lunch time in Karlsruhe-Durlach
This program is intended to help you choose a restaurant for lunch in Karlsruhe-Durlach.
The following Perl modules are required:
- Config::General
- Encode
- File::Slurp
- File::Temp
- HTML::Template
- JSON
- Moo
- Modern::Perl
- URI
- Web::Scraper

Debian packages: libconfig-general-perl libencode-perl libfile-slurp-perl libfile-temp-perl libhtml-template-perl libjson-perl libmoo-perl libmodern-perl-perl liburi-perl libweb-scraper-perl
The following system binaries are required:
- /usr/bin/pdftotext (Debian package: poppler-utils)
- /usr/bin/lessc (Debian package: node-less; or npm install -g less)
mkdir out
lessc src/less/lunchtime.less out/lunchtime.css
bin/gather_json_data.pl && bin/fill_template.pl
Only restaurants in Karlsruhe-Durlach are covered. For the current list, see the config file.
The config file is written in Perl's Config::General style. Every restaurant has to be enclosed in a unique tag like ....
Name of the restaurant which is displayed in the HTML output file.
URL to either an HTML or PDF location which should be parsed for lunchtime menus.
Either "html" or "pdf" has to be defined here. This determines how the source is parsed: whether pdftotext or an HTML scraper is invoked.
If the type is "html", you have to define XPath expressions that locate the "menu", "description" (optional) and "comment" (optional) in the page at the given URL.
A new line is injected into the output after every line matching this regular expression.
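The spacer behaviour can be sketched in plain Perl. This is a minimal illustration, not the project's actual code: the helper name `inject_blank_lines` and the sample menu text are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not taken from the project): returns $text with an
# empty line added after every line that matches $regex.
sub inject_blank_lines {
    my ($text, $regex) = @_;
    my @out;
    for my $line (split /\n/, $text) {
        push @out, $line;
        push @out, '' if $line =~ $regex;   # spacer after a matching line
    }
    return join "\n", @out;
}

# Made-up sample input: separate the daily entries after each price line.
my $menu = "Monday\nSchnitzel 7.50 EUR\nTuesday\nPasta 6.90 EUR";
print inject_blank_lines($menu, qr/EUR$/), "\n";
```

Lines that do not match the expression pass through unchanged, so an expression that never matches leaves the parsed text as it was.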
Here you can write your own Perl sub, which is run over the parsed input just before the output is written to the HTML file.
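To make the overall shape concrete, a restaurant entry might look roughly like the fragment below. It only illustrates Config::General block syntax: the tag name, every key name, the URL and the XPath expressions are assumptions made up for this sketch, so check the shipped config file for the names the program really expects.

```
# Hypothetical entry; tag, key names and values are illustrative assumptions.
<example_restaurant>
    name = Example Restaurant
    url  = http://www.example.com/lunch.html
    type = html
    <xpath>
        menu        = //div[@id="lunch"]
        description = //div[@id="lunch"]/h2
    </xpath>
</example_restaurant>
```

An entry of type "pdf" would presumably need no XPath block, since the text comes from pdftotext instead of an HTML scraper.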
- spacer option which inserts a new line in front of lines matching a given regex
- get pictures from xpaths like: .//div[@class='fbStarGrid']/div[1]/a (for Cafe Galerie, from: https://www.facebook.com/pages/Cafe-Galerie/181267271890905?sk=photos_stream&ref=page_internal)