{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 40 Working with GPX Data in GemGIS\n", "\n", "GPX, or GPS Exchange Format, is an XML schema designed as a common GPS data format for software applications. It can be used to describe waypoints, tracks, and routes. The format is open and can be used without the need to pay license fees. Location data (and optionally elevation, time, and other information) is stored in tags and can be interchanged between GPS devices and software. Common software applications for the data include viewing tracks projected onto various map sources, annotating maps, and geotagging photographs based on the time they were taken.\n", "\n", "Source: https://en.wikipedia.org/wiki/GPS_Exchange_Format" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set File Paths and Download Tutorial Data\n", "\n", "If you downloaded the latest `GemGIS` version from the GitHub repository, append the path so that the package can be imported successfully. Otherwise, it is recommended to install `GemGIS` via `pip install gemgis` and import it using `import gemgis as gg`. In addition, the path to the folder in which the data is stored is set. The tutorial data is downloaded using Pooch (https://www.fatiando.org/pooch/latest/index.html) and stored in the specified folder. Use `pip install pooch` if Pooch is not yet installed on your system." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2021-03-17T12:03:13.671689Z", "start_time": "2021-03-17T12:03:11.349579Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "C:\\Users\\ale93371\\Anaconda3\\envs\\gemgis\\lib\\site-packages\\gemgis\\gemgis.py:27: UserWarning: Shapely 2.0 is installed, but because PyGEOS is also installed, GeoPandas will still use PyGEOS by default for now. To force to use and test Shapely 2.0, you have to set the environment variable USE_PYGEOS=0. 
You can do this before starting the Python process, or in your code before importing geopandas:\n", "\n", "import os\n", "os.environ['USE_PYGEOS'] = '0'\n", "import geopandas\n", "\n", "In a future release, GeoPandas will switch to using Shapely by default. If you are using PyGEOS directly (calling PyGEOS functions on geometries from GeoPandas), this will then stop working and you are encouraged to migrate from PyGEOS to Shapely 2.0 (https://shapely.readthedocs.io/en/latest/migration_pygeos.html).\n", " import geopandas as gpd\n" ] } ], "source": [ "import gemgis as gg\n", "\n", "file_path ='data/40_working_with_gpx_data_in_gemgis/'" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2021-03-17T12:03:14.093507Z", "start_time": "2021-03-17T12:03:13.704939Z" } }, "outputs": [], "source": [ "gg.download_gemgis_data.download_tutorial_data(filename=\"40_working_with_gpx_data_in_gemgis.zip\", dirpath=file_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load Data\n", "\n", "Data from a running practice in northern Germany is used for demonstration purposes." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:44.943487Z", "start_time": "2021-01-04T08:23:42.139586Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import geopandas as gpd\n", "\n", "gpx = gg.vector.load_gpx(path=file_path+'Run.gpx', layer='tracks')\n", "gpx" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Inspecting the data\n", "\n", "The driver used to open the data was ``GPX``." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:44.959583Z", "start_time": "2021-01-04T08:23:44.945482Z" } }, "outputs": [ { "data": { "text/plain": [ "'GPX'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gpx.driver" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The CRS of the data is ``EPSG:4326``." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:44.974844Z", "start_time": "2021-01-04T08:23:44.961561Z" } }, "outputs": [ { "data": { "text/plain": [ "{'init': 'epsg:4326'}" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gpx.crs" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:44.989999Z", "start_time": "2021-01-04T08:23:44.976868Z" } }, "outputs": [ { "data": { "text/plain": [ "'GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.0174532925199433,AUTHORITY[\"EPSG\",\"9122\"]],AXIS[\"Latitude\",NORTH],AXIS[\"Longitude\",EAST],AUTHORITY[\"EPSG\",\"4326\"]]'" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gpx.crs_wkt" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "The ``bounds`` attribute returns the extent of the GPX data as ``(minx, miny, maxx, maxy)``." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.022014Z", "start_time": "2021-01-04T08:23:44.992008Z" } }, "outputs": [ { "data": { "text/plain": [ "(8.460906, 52.694879, 8.501507, 52.732331)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gpx.bounds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The ``closed`` attribute reports whether the underlying file collection has been closed, not whether the track forms a closed loop. Here, the collection is still open." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.037965Z", "start_time": "2021-01-04T08:23:45.024022Z" } }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gpx.closed" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The metadata of the collection can be accessed via ``meta``." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.052960Z", "start_time": "2021-01-04T08:23:45.038967Z" } }, "outputs": [ { "data": { "text/plain": [ "{'driver': 'GPX',\n", " 'schema': {'properties': OrderedDict([('name', 'str'),\n", " ('cmt', 'str'),\n", " ('desc', 'str'),\n", " ('src', 'str'),\n", " ('link1_href', 'str'),\n", " ('link1_text', 'str'),\n", " ('link1_type', 'str'),\n", " ('link2_href', 'str'),\n", " ('link2_text', 'str'),\n", " ('link2_type', 'str'),\n", " ('number', 'int'),\n", " ('type', 'str')]),\n", " 'geometry': 'MultiLineString'},\n", " 'crs': {'init': 'epsg:4326'},\n", " 'crs_wkt': 'GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.0174532925199433,AUTHORITY[\"EPSG\",\"9122\"]],AXIS[\"Latitude\",NORTH],AXIS[\"Longitude\",EAST],AUTHORITY[\"EPSG\",\"4326\"]]'}" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gpx.meta" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2021-01-04T07:44:55.388142Z", "start_time": "2021-01-04T07:44:55.379618Z" } }, "source": [ "The ``name`` attribute returns the name of the loaded layer." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.068971Z", "start_time": "2021-01-04T08:23:45.054962Z" } }, "outputs": [ { "data": { "text/plain": [ "'tracks'" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gpx.name" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.084150Z", "start_time": "2021-01-04T08:23:45.070971Z" } }, "outputs": [ { "data": { "text/plain": [ "{'driver': 'GPX',\n", " 'schema': {'properties': OrderedDict([('name', 'str'),\n", " ('cmt', 'str'),\n", " ('desc', 'str'),\n", " ('src', 'str'),\n", " ('link1_href', 'str'),\n", " ('link1_text', 'str'),\n", " ('link1_type', 'str'),\n", " ('link2_href', 'str'),\n", " ('link2_text', 'str'),\n", " ('link2_type', 'str'),\n", " ('number', 'int'),\n", " ('type', 'str')]),\n", " 'geometry': 'MultiLineString'},\n", " 'crs': {'init': 'epsg:4326'},\n", " 'crs_wkt': 'GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.0174532925199433,AUTHORITY[\"EPSG\",\"9122\"]],AXIS[\"Latitude\",NORTH],AXIS[\"Longitude\",EAST],AUTHORITY[\"EPSG\",\"4326\"]]'}" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gpx.profile" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2021-01-04T07:44:43.067887Z", "start_time": "2021-01-04T07:44:43.050881Z" } }, "source": [ "## Loading GPX as dict\n", "\n", "The GPX can also be loaded as dict for further processing of the contents of the GPX file using ``load_gpx_as_dict(..)``. 
This dict contains the properties, the geometry including the coordinates of the data, the ID and the type of the data." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.147166Z", "start_time": "2021-01-04T08:23:45.086155Z" } }, "outputs": [ { "data": { "text/plain": [ "[(8.496285, 52.705566),\n", " (8.49627, 52.705593),\n", " (8.496234, 52.705629),\n", " (8.496205, 52.705664),\n", " (8.496181, 52.705705)]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gpx_dict = gg.vector.load_gpx_as_dict(path=file_path+'Run.gpx', layer='tracks')\n", "gpx_dict['geometry']['coordinates'][0][:5]" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.163172Z", "start_time": "2021-01-04T08:23:45.149166Z" } }, "outputs": [ { "data": { "text/plain": [ "dict_keys(['type', 'id', 'properties', 'geometry'])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gpx_dict.keys()" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.210182Z", "start_time": "2021-01-04T08:23:45.196179Z" } }, "outputs": [ { "data": { "text/plain": [ "('Feature',\n", " '0',\n", " OrderedDict([('name', 'First half marathon distance of the year'),\n", " ('cmt', None),\n", " ('desc', None),\n", " ('src', None),\n", " ('link1_href', None),\n", " ('link1_text', None),\n", " ('link1_type', None),\n", " ('link2_href', None),\n", " ('link2_text', None),\n", " ('link2_type', None),\n", " ('number', None),\n", " ('type', '9')]))" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gpx_dict['type'], gpx_dict['id'], gpx_dict['properties']" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.258193Z", "start_time": "2021-01-04T08:23:45.244189Z" } }, 
"outputs": [ { "data": { "text/plain": [ "'MultiLineString'" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gpx_dict['geometry']['type']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating Shapely Base Geometry from GPX\n", "\n", "In order to work with GPX data, a Shapely BaseGeometry can be created using ``load_gpx_as_geometry(...)``.\n" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.352951Z", "start_time": "2021-01-04T08:23:45.307204Z" } }, "outputs": [ { "data": { "image/svg+xml": [ "" ], "text/plain": [ "" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "shape = gg.vector.load_gpx_as_geometry(path=file_path+'Run.gpx', layer='tracks')\n", "shape" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.384968Z", "start_time": "2021-01-04T08:23:45.354950Z" } }, "outputs": [ { "data": { "text/plain": [ "'MULTILINESTRING ((8.496285 52.705566, 8.49627 52.705593, 8.496234 52.705629, 8.496205 52.705664, 8.4'" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "shape.wkt[:100]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating GeoData from Geometry\n", "\n", "A GeoDataFrame containing the created geometry can easily be created. Note that the CRS of the GPX collection is passed to the GeoDataFrame." 
] }, { "cell_type": "code", "execution_count": 18, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.448442Z", "start_time": "2021-01-04T08:23:45.386957Z" } }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>geometry</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr>\n", "      <th>0</th>\n", "      <td>MULTILINESTRING ((8.49629 52.70557, 8.49627 52...</td>\n", "    </tr>\n", "  </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " geometry\n", "0 MULTILINESTRING ((8.49629 52.70557, 8.49627 52..." ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import geopandas as gpd\n", "\n", "gdf = gpd.GeoDataFrame(geometry=[shape], crs=gpx.crs)\n", "gdf" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.479430Z", "start_time": "2021-01-04T08:23:45.450433Z" } }, "outputs": [ { "data": { "text/plain": [ "\n", "Name: WGS 84\n", "Axis Info [ellipsoidal]:\n", "- lon[east]: Longitude (degree)\n", "- lat[north]: Latitude (degree)\n", "Area of Use:\n", "- name: World.\n", "- bounds: (-180.0, -90.0, 180.0, 90.0)\n", "Datum: World Geodetic System 1984 ensemble\n", "- Ellipsoid: WGS 84\n", "- Prime Meridian: Greenwich" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gdf.crs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data can now be plotted." 
] }, { "cell_type": "code", "execution_count": 20, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.636062Z", "start_time": "2021-01-04T08:23:45.483433Z" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "\n", "gdf.plot()\n", "plt.grid()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After reprojecting the data to the projected CRS ``EPSG:4647``, which uses metres as units, the length of the track is approximately 21140 m and therefore slightly longer than a half marathon distance (21097.5 m)." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T08:23:45.730333Z", "start_time": "2021-01-04T08:23:45.639073Z" } }, "outputs": [ { "data": { "text/plain": [ "21139.875974842187" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gdf.to_crs(4647).loc[0].geometry.length" ] } ], "metadata": { "hide_input": false, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.8" }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }