This is the best calendar_table with the most columns of interesting date dimensions you'll ever find!
- year number, month number, day number, yearmonth, yearquarter
- month name, day name, full date with ordinal suffix (e.g., "January 3rd, 2012")
- etc.
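
As a rough illustration of how a few of these basic dimensions can be derived from a single date, here is a minimal sketch using only the standard library. The function names and the `yearmonth` / `yearquarter` string formats are assumptions made for this example; the actual column names and formats are listed in the documentation files. Month and day names come from `strftime`, so they are English only under the default locale.

```python
from datetime import date


def ordinal_suffix(day: int) -> str:
    """Return 'st', 'nd', 'rd', or 'th' for a day of the month."""
    if 11 <= day % 100 <= 13:  # 11th, 12th, 13th are irregular
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")


def basic_dimensions(d: date) -> dict:
    """Build a handful of date dimensions for one calendar day (hypothetical helper)."""
    quarter = (d.month - 1) // 3 + 1
    month_name = d.strftime("%B")
    return {
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "yearmonth": f"{d.year}{d.month:02d}",   # assumed format, e.g. "201201"
        "yearquarter": f"{d.year}Q{quarter}",    # assumed format, e.g. "2012Q1"
        "month_name": month_name,
        "day_name": d.strftime("%A"),
        "full_date": f"{month_name} {d.day}{ordinal_suffix(d.day)}, {d.year}",
    }


row = basic_dimensions(date(2012, 1, 3))
print(row["full_date"])  # January 3rd, 2012
```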
Additionally, some interesting and unique data elements include:
- holiday identification including Easter in any year (including pre-1583)
- moon phase identification
- sunrise/sunset for UTC as well as user-provided lat/lon coordinates
- length of daylight / darkness each day/evening
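
To give a sense of what "Easter in any year" involves, the sketch below uses two well-known computus formulas: Meeus's Julian algorithm for years before 1583 and the anonymous Gregorian algorithm (Meeus/Jones/Butcher) from 1583 onward. This is not necessarily how the script itself does it. The function name is hypothetical, and returning the pre-1583 result as a Julian-calendar date is an assumption of this example.

```python
def easter_month_day(year: int) -> tuple:
    """Return (month, day) of Western Easter Sunday for the given year.

    Assumption: years before 1583 use the Julian computus and the result
    is a date in the Julian calendar; later years use the Gregorian one.
    """
    if year < 1583:
        # Meeus's Julian algorithm
        a, b, c = year % 4, year % 7, year % 19
        d = (19 * c + 15) % 30
        e = (2 * a + 4 * b - d + 34) % 7
        n = d + e + 114
    else:
        # Anonymous Gregorian algorithm (Meeus/Jones/Butcher)
        a = year % 19
        b, c = divmod(year, 100)
        d, e = divmod(b, 4)
        f = (b + 8) // 25
        g = (b - f + 1) // 3
        h = (19 * a + b - d - g + 15) % 30
        i, k = divmod(c, 4)
        l = (32 + 2 * e + 2 * i - h - k) % 7
        m = (a + 11 * h + 22 * l) // 451
        n = h + l - 7 * m + 114
    # n encodes the month (3 = March, 4 = April) and the zero-based day
    return n // 31, n % 31 + 1


print(easter_month_day(2024))  # (3, 31)
print(easter_month_day(1500))  # (4, 19), Julian calendar
```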
For more information about the date dimensions and fields available, see these supporting documents:
- Documentation: Full column list with datatypes and descriptions
- Documentation: Sample output (download and view in a CSV viewer like Excel)
- PROCESS FLOW CONTROLS
  Check today's date. If it is a Sunday or the last day of the month, then run a code block.
- SLICING DATA FOR ANALYSIS
  From the revenue dataset, show all sales that fall between Thanksgiving and Christmas, and determine which days would be best for doorbuster promotions.
- FEATURES FOR MACHINE LEARNING MODELS
  Using a polynomial linear regression, predict the end-of-month sales from the month-to-date sales and the percent of the month elapsed, comparing against similar data from prior months.
- DATA VISUALIZATION
  In Microsoft Power BI, Excel, Tableau, or any other tool, use the workday of the month as the x-axis in a bar chart that shows the change in time clocked by employees on a given project in a given month.
- AND MORE?
  Send me more examples to add!
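
The process flow control case above can be sketched as follows. This version computes the two conditions directly with the standard library. With a generated calendar table you would look up the equivalent columns for today's row instead. The function names here are hypothetical and chosen only for the example.

```python
import calendar
from datetime import date


def is_last_day_of_month(d: date) -> bool:
    """True when d is the final calendar day of its month."""
    return d.day == calendar.monthrange(d.year, d.month)[1]


def should_run(d: date) -> bool:
    """True on Sundays and on the last day of any month."""
    return d.weekday() == 6 or is_last_day_of_month(d)  # Monday is 0, Sunday is 6


if should_run(date.today()):
    print("Running the weekly / month-end job")
```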
- Example CSV output is provided in git repo
- When this code is run for a span of 5 years:
  - ~2,200 rows of data are created with ~110 columns (~240k cells)
  - the resulting CSV's file size is ~1.5 MiB
  - the script takes ~28 seconds on my Raspberry Pi 4B (very low specs)
  - the code will run much faster on a modern laptop or desktop
- No third-party imports are needed; only standard Python 3.7+ packages are used
- A 32-bit OS or hardware may be unable to generate table data for years far in the future (e.g., 2040) and may raise "OverflowError: timestamp out of range for platform time_t"; no limitation has been observed on 64-bit systems.
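
The `time_t` error typically comes from calls that hand a timestamp to the platform C library, such as `time.localtime` or `datetime.fromtimestamp`. Which call triggers it in this script is not stated here, so the following is only a sketch of one possible workaround, with hypothetical helper names: step through dates with `timedelta` and derive epoch seconds with pure `datetime` arithmetic, which does not depend on the platform `time_t`.

```python
from datetime import date, datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # naive datetime, treated as UTC


def daterange(start: date, end: date):
    """Yield each date from start to end inclusive, without using timestamps."""
    d = start
    while d <= end:
        yield d
        d += timedelta(days=1)


def epoch_seconds_utc(d: date) -> int:
    """Seconds from 1970-01-01 to midnight UTC of d, via datetime arithmetic only."""
    return int((datetime(d.year, d.month, d.day) - EPOCH).total_seconds())


for d in daterange(date(2039, 12, 31), date(2040, 1, 1)):
    # Both values exceed 2**31 - 1, the largest signed 32-bit time_t
    print(d.isoformat(), epoch_seconds_utc(d))
```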
- modularize the code with classes/functions
- use config file to set main params (start, end, holiday rules, lat/lon for localized data like moon phase, sunrise, etc.)
- add columns
  - percent through week / yearmonth / year / etc.
  - add season based on official start/end of seasons? (equinox, etc.)
- fix bugs in the following:
  - night duration UTC / night duration local
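
For the planned "percent through" columns, one possible definition is sketched below. The function names, the choice to count the current day as elapsed, and the Monday-start week are all assumptions made for this example, not decisions the project has made.

```python
import calendar
from datetime import date


def percent_through_week(d: date) -> float:
    """Fraction of the week elapsed, assuming weeks start on Monday."""
    return d.isoweekday() / 7


def percent_through_month(d: date) -> float:
    """Fraction of the month elapsed, counting d itself as elapsed."""
    return d.day / calendar.monthrange(d.year, d.month)[1]


def percent_through_year(d: date) -> float:
    """Fraction of the year elapsed, counting d itself as elapsed."""
    days_in_year = 366 if calendar.isleap(d.year) else 365
    return d.timetuple().tm_yday / days_in_year


print(percent_through_month(date(2023, 4, 15)))  # 0.5
```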