How to convert points to lines in GeoPandas

created at 07-02-2021

A few lines of code are enough to implement point-to-line conversion.

Converting points to lines is a common operation in GIS data processing.
Here is one way to implement it in Python.

1. Materials

  • Pandas
  • GeoPandas
  • Shapely

Anyone who has used Pandas and GeoPandas knows how powerful they are.
First, look at the point file below: a CSV file with longitude and latitude coordinates. We use pandas to load and display the data:

import pandas as pd
import geopandas as gpd
from shapely.geometry import LineString,Point

fp = r'E:\Dev\data\BigRoads31.csv'
df = pd.read_csv(fp)
df.iloc[:30,:]
    FID        Name  State  Speed  Num       Lng       Lat
 0    0   fakeName1      3     10    0  108.8881  34.21166
 1    1   fakeName2      3     10    0  108.8897  34.21167
 2    2   fakeName3      2     15    1  108.8569   34.2016
 3    3   fakeName4      2     15    1   108.857   34.2016
 4    4   fakeName5      2     15    1  108.8594  34.20164
 5    5   fakeName6      2     15    1  108.8619  34.20163
 6    6   fakeName7      2     15    1  108.8637  34.20161
 7    7   fakeName8      2     15    1  108.8684  34.20163
 8    8   fakeName9      2     15    1  108.8694  34.20163
 9    9  fakeName10      2     25    2  108.8694  34.20171
10   10  fakeName11      2     25    2  108.8693  34.20171
11   11  fakeName12      2     25    2  108.8684   34.2017
12   12  fakeName13      2     25    2  108.8636  34.20168
13   13  fakeName14      2     25    2  108.8594   34.2017
14   14  fakeName15      2     25    2   108.857  34.20171
15   15  fakeName16      2     25    2  108.8569   34.2017
16   16  fakeName17      2     25    3  108.8773  34.21166
17   17  fakeName18      2     25    3  108.8773  34.21174
18   18  fakeName19      2     25    3  108.8773  34.21214
19   19  fakeName20      2     25    3  108.8773  34.21411
20   20  fakeName21      2     25    3  108.8772  34.21653
21   21  fakeName22      2     25    3  108.8772  34.22002
22   22  fakeName23      2     25    3  108.8773   34.2201
23   23  fakeName24      2     25    4  108.8951  34.23529
24   24  fakeName25      2     25    4  108.8951  34.23503
25   25  fakeName26      2     25    4  108.8951  34.23268
26   26  fakeName27      2     25    4  108.8951  34.23246
27   27  fakeName28      2     25    4  108.8951  34.23138
28   28  fakeName29      2     25    4   108.895  34.22857
29   29  fakeName30      2     25    4   108.895  34.22846

Each row represents a point, and the last two columns hold its coordinates. Two other columns matter here: FID is the auto-increment ID of each point, and Num is the number that identifies which line the point belongs to. The point-to-line conversion is based on these two columns.
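Before converting, it can help to confirm that the columns the conversion depends on are present and to see how many points fall under each Num value. The following is a minimal sketch on a small stand-in DataFrame; the values are illustrative and are not read from BigRoads31.csv.

```python
import pandas as pd

# Tiny stand-in for the CSV; values are illustrative, not taken from the real file.
df = pd.DataFrame({
    "FID": [0, 1, 2, 3, 4],
    "Num": [0, 0, 1, 1, 2],
    "Lng": [108.8881, 108.8897, 108.8569, 108.8570, 108.8694],
    "Lat": [34.21166, 34.21167, 34.20160, 34.20160, 34.20171],
})

# The conversion relies on these four columns.
required = {"FID", "Num", "Lng", "Lat"}
missing = required - set(df.columns)
if missing:
    raise ValueError(f"missing columns: {sorted(missing)}")

# Number of points that will end up in each line.
points_per_line = df.groupby("Num").size()

# A line needs at least two points, so single-point groups cannot be converted.
too_short = points_per_line[points_per_line < 2].index.tolist()

print(points_per_line.to_dict())
print("groups with fewer than two points:", too_short)
```

In this stand-in data, the group with Num 2 has a single point and would need to be skipped or handled separately.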

Plot the points with GeoPandas to see their geographic distribution, as shown in the figure below:

xy = [Point(xy) for xy in zip(df.Lng,df.Lat)]
pointDataFrame = gpd.GeoDataFrame(df,geometry=xy)
pointDataFrame.plot(figsize = (24, 24))

plot of points
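As an alternative to building the Point list by hand, GeoPandas provides `points_from_xy`. The sketch below uses a two-row stand-in DataFrame and also sets a coordinate reference system. Note that `EPSG:4326` (WGS 84) is an assumption here, since the article only says the columns hold longitude and latitude.

```python
import pandas as pd
import geopandas as gpd

# Two illustrative points; column names follow the CSV shown above.
df = pd.DataFrame({
    "FID": [0, 1],
    "Lng": [108.8881, 108.8897],
    "Lat": [34.21166, 34.21167],
})

# points_from_xy builds the Point geometries in one call (x = longitude, y = latitude).
# crs="EPSG:4326" is an assumption: the source data does not state its datum.
points = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["Lng"], df["Lat"]),
    crs="EPSG:4326",
)

print(points.geometry.iloc[0])
```

Calling `points.plot()` on the result draws the points in the same way as the code above.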

Anyone familiar with GIS vector data structures knows that a line is made up of points: physically, it is an ordered sequence of points. The point-to-line conversion relies on this property.
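A small Shapely example illustrates why the order of the points matters. The coordinates are made up for illustration: the same three points give a different line when they are connected in a different order.

```python
from shapely.geometry import LineString

coords = [(0, 0), (1, 0), (2, 0)]

# Points connected in their natural order along the x axis.
in_order = LineString(coords)

# Same points, but the last two are swapped, so the path doubles back.
shuffled = LineString([coords[0], coords[2], coords[1]])

print(in_order.length)  # 2.0
print(shuffled.length)  # 3.0: the path goes to x=2 and then back to x=1
```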

2. Point to line

Connecting points into lines requires rules. Based on the table above, we convert points to lines according to the following rules:

  1. Points with the same Num value are merged into one line;
  2. Points within a line are ordered by FID, that is, in the order in which they appear in the table.

The conversion process is as follows:

# grouping
dataGroup = df.groupby('Num')

# build data
tb = []
geomList = []
for name, group in dataGroup:
    # Use the first 5 columns of the first row of each group as the line's attributes
    tb.append(group.iloc[0, :5])
    # Collect the (Lng, Lat) pairs of the group, in table order
    xyList = [xy for xy in zip(group.Lng, group.Lat)]

    # Note: LineString needs at least two points
    line = LineString(xyList)
    geomList.append(line)

# Combine the attributes and the line geometries
geoDataFrame = gpd.GeoDataFrame(tb, geometry=geomList)

Let's print the result:

geoDataFrame.iloc[:20,:]
       FID        Name  State  Speed  Num  geometry
   0     0   fakeName1      3     10    0  LINESTRING (108.888107 34.21166229999999, 108....
   2     2   fakeName2      2     15    1  LINESTRING (108.856926 34.2015953, 108.857033 ...
   9     9   fakeName3      2     25    2  LINESTRING (108.869408 34.2017097, 108.869308 ...
  16    16   fakeName4      2     25    3  LINESTRING (108.877281 34.2116547, 108.877281 ...
  23    23   fakeName5      2     25    4  LINESTRING (108.895088 34.2352905, 108.895073 ...
  31    31   fakeName6      2     20    5  LINESTRING (108.900749 34.2257996, 108.900589 ...
  38    38   fakeName7      1     40    6  LINESTRING (108.781281 33.8055725, 108.781242 ...
2914  2914   fakeName8      1     45    7  LINESTRING (108.8955 34.19851679999999, 108.89...
3131  3131   fakeName9      1     35    8  LINESTRING (108.895638 34.1985016, 108.89666 3...
3174  3174  fakeName10      1     30    9  LINESTRING (108.875626 34.1899567, 108.87558 3...
3180  3180  fakeName11      1     30   10  LINESTRING (108.875481 34.19846339999999, 108....
3187  3187  fakeName12      1     35   11  LINESTRING (108.881477 34.1797485, 108.881508 ...
3229  3229  fakeName13      1     35   12  LINESTRING (108.890282 34.19849779999999, 108....
3267  3267  fakeName14      1     35   13  LINESTRING (108.874435 34.2050018, 108.874565 ...
3339  3339  fakeName15      1     35   14  LINESTRING (108.945831 34.19813920000001, 108....
3408  3408  fakeName16      1     35   15  LINESTRING (108.855911 34.1825256, 108.855934 ...
3430  3430  fakeName17      1     40   16  LINESTRING (108.857033 34.2045937, 108.856934 ...
3451  3451  fakeName18      1     35   17  LINESTRING (108.841652 34.1938324, 108.841728 ...
3482  3482  fakeName19      1     40   18  LINESTRING (108.864204 34.1888275, 108.86409 3...
3497  3497  fakeName20      1     35   19  LINESTRING (108.863541 34.20478060000001, 108....

As you can see, the points have been grouped by Num, and each group has been merged into a LineString object.

plot of lines
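The conversion above depends on the rows already being in FID order and on every group having at least two points. Below is a minimal sketch of a more defensive variant, run on made-up coordinates. The helper name `points_to_lines` and the `n_points` column are hypothetical, and unlike the code above it keeps only the group key as an attribute.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import LineString


def points_to_lines(df, group_col="Num", order_col="FID"):
    """Hypothetical helper: one LineString per group, points ordered by order_col."""
    records, geoms = [], []
    # Sorting first makes the point order explicit;
    # groupby keeps the row order within each group.
    for key, group in df.sort_values(order_col).groupby(group_col):
        if len(group) < 2:
            continue  # a LineString needs at least two points
        records.append({group_col: key, "n_points": len(group)})
        geoms.append(LineString(list(zip(group["Lng"], group["Lat"]))))
    return gpd.GeoDataFrame(records, geometry=geoms)


# Illustrative data: rows are deliberately out of FID order,
# and the group with Num == 2 has only one point.
df = pd.DataFrame({
    "FID": [3, 0, 1, 2, 4],
    "Num": [1, 0, 0, 1, 2],
    "Lng": [2.0, 0.0, 1.0, 1.0, 9.0],
    "Lat": [5.0, 0.0, 0.0, 5.0, 9.0],
})

lines = points_to_lines(df)
print(lines)
```

With this input, the result holds two lines (Num 0 and Num 1), each with its points in FID order, and the single-point group is skipped.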

3. Save the result as a GeoJSON file or shapefile

  1. Save as GeoJSON (two methods)
# method 1
fp = r"E:\Dev\data\lineRoads.geojson"
geoDataFrame.to_file(fp, driver='GeoJSON', encoding="utf-8")

# method 2
# to_json() returns the GeoJSON text; the variable name avoids shadowing the json module
geojson_str = geoDataFrame.to_json()
with open(fp, 'w', encoding='utf-8') as f:
    f.write(geojson_str)
  2. Save as a shapefile
shp = r"E:\Dev\data\lineRoads.shp"
geoDataFrame.to_file(shp, driver="ESRI Shapefile", encoding="utf-8")
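To check what method 2 writes without touching the disk, the string returned by `to_json()` can be parsed with the standard `json` module. This is a minimal sketch on a one-line stand-in GeoDataFrame with illustrative coordinates.

```python
import json

import geopandas as gpd
from shapely.geometry import LineString

# One illustrative line with a single attribute column.
lines = gpd.GeoDataFrame(
    {"Num": [0]},
    geometry=[LineString([(108.8881, 34.21166), (108.8897, 34.21167)])],
)

# to_json() returns a GeoJSON FeatureCollection as a string.
collection = json.loads(lines.to_json())
feature = collection["features"][0]

print(collection["type"])
print(feature["geometry"]["type"])
print(feature["properties"])
```

Each row becomes one feature, with the non-geometry columns stored under `properties`.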

4. Point to line complete code

Leaving out the test and verification code above, the point-to-line conversion takes only a few lines:

import pandas as pd
import geopandas as gpd
from shapely.geometry import LineString


def main():
    fp = r'E:\Dev\data\BigRoads31.csv'
    df = pd.read_csv(fp)

    # Group the points by line number
    dataGroup = df.groupby('Num')

    # Build the attribute rows and the geometries
    tb = []
    geomList = []
    for name, group in dataGroup:
        # Use the first 5 columns of the first row of each group as the line's attributes
        tb.append(group.iloc[0, :5])
        # Collect the (Lng, Lat) pairs of the group, in table order
        xyList = [xy for xy in zip(group.Lng, group.Lat)]

        line = LineString(xyList)
        geomList.append(line)

    # Combine the attributes and the line geometries
    geoDataFrame = gpd.GeoDataFrame(tb, geometry=geomList)

    fp = r"E:\Dev\data\lineRoads.geojson"
    geoDataFrame.to_file(fp, driver='GeoJSON', encoding="utf-8")

if __name__ == '__main__':
    main()
edited at: 07-03-2021