An analytic to help infer movement patterns from large amounts of geo-temporal data in a cloud environment.
A collection of independent entries that represent an identified object's geographic location at a given point in time.
Key Data Fields [ ID, TIMESTAMP, LATITUDE, LONGITUDE ]
Specific formatting and analytic tool configurations for using your own data set(s) are provided within the wiki.
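To make the record shape concrete, here is a minimal Python sketch that reads entries carrying the four key fields and groups them by object ID. The comma-separated layout, the column order, the timestamp format, and the sample rows are all assumptions made for illustration; the wiki defines the formatting the tool actually expects.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows in the order ID, TIMESTAMP, LATITUDE, LONGITUDE.
# The delimiter and timestamp format are assumptions, not the project's spec.
SAMPLE = """\
vessel_a,2012-01-01 00:00:00,36.85,-76.29
vessel_b,2012-01-01 00:00:30,36.90,-76.10
vessel_a,2012-01-01 00:05:00,36.86,-76.27
"""


def load_points(text):
    """Return {object_id: [(timestamp, lat, lon), ...]} sorted by time."""
    points = defaultdict(list)
    for obj_id, ts, lat, lon in csv.reader(io.StringIO(text)):
        when = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        points[obj_id].append((when, float(lat), float(lon)))
    # Entries are independent, so order each object's points chronologically.
    for series in points.values():
        series.sort()
    return dict(points)


points = load_points(SAMPLE)
for obj_id, series in sorted(points.items()):
    print(obj_id, len(series))
```

Each entry is independent, so nothing links two rows except the shared ID and the ordering of their timestamps.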
- Infers movement patterns from the given geo-temporal data and builds tracks (or paths) of movement for each unique object in your collection.
- Creates data tables that aggregate information about track activity within an area, the average velocity and movement of tracks, and the average direction/bearing of tracks.
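The quantities named above (distance moved, velocity, bearing) can be illustrated for a single object with a short Python sketch. This is not the project's Hive streaming implementation; it uses the standard haversine and initial-bearing formulas on a spherical Earth, and the function names, the epoch-seconds timestamps, and the sample track are assumptions chosen for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing from point 1 to point 2, in [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    return math.degrees(math.atan2(y, x)) % 360.0


def segments(track):
    """Turn a time-ordered list of (epoch_seconds, lat, lon) into segments."""
    out = []
    for (t1, la1, lo1), (t2, la2, lo2) in zip(track, track[1:]):
        hours = (t2 - t1) / 3600.0
        if hours <= 0:
            continue  # skip duplicate or out-of-order timestamps
        dist = haversine_km(la1, lo1, la2, lo2)
        out.append({
            "distance_km": dist,
            "speed_kph": dist / hours,
            "bearing_deg": initial_bearing_deg(la1, lo1, la2, lo2),
        })
    return out


# Hypothetical track: one degree east along the equator, then one degree north.
track = [(0, 0.0, 0.0), (3600, 0.0, 1.0), (7200, 1.0, 1.0)]
segs = segments(track)
for seg in segs:
    print(seg)
```

Averaging the per-segment speed and bearing over all tracks that pass through an area would give aggregates of the kind the tables describe, although the analytic's own aggregation rules may differ.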
To use your own data sets, some knowledge of the following is required:
- Cloudera CDH 5.13.1, Hadoop {streaming}
- Apache Hive
- Python programming language + gmpy2
To run the example, execute run_ais.sh, found in {project-root}/hive-streaming.
This script unpacks the sample data, uploads it to the Hadoop filesystem, loads it into Hive, and runs the Aggregate Micro Pathing algorithm. When it completes, it also pulls the finished count data from Hive and writes it locally to a .csv file in the {project-root}/hive-streaming/output directory.
For detailed instructions, go to the wiki.