In GIS data processing, converting points to lines is a common operation.
Here is one way to implement point-to-line conversion in Python.
Anyone who has used them knows how powerful Pandas and GeoPandas are.
First, look at the point file below. It is a CSV file with longitude and latitude coordinates. We load it with pandas and display the first 30 rows:
import pandas as pd
import geopandas as gpd
from shapely.geometry import LineString,Point
fp = r'E:\Dev\data\BigRoads31.csv'
df = pd.read_csv(fp)
df.iloc[:30,:]
| FID | Name | State | Speed | Num | Lng | Lat |
0 | 0 | fakeName1 | 3 | 10 | 0 | 108.8881 | 34.21166 |
1 | 1 | fakeName2 | 3 | 10 | 0 | 108.8897 | 34.21167 |
2 | 2 | fakeName3 | 2 | 15 | 1 | 108.8569 | 34.2016 |
3 | 3 | fakeName4 | 2 | 15 | 1 | 108.857 | 34.2016 |
4 | 4 | fakeName5 | 2 | 15 | 1 | 108.8594 | 34.20164 |
5 | 5 | fakeName6 | 2 | 15 | 1 | 108.8619 | 34.20163 |
6 | 6 | fakeName7 | 2 | 15 | 1 | 108.8637 | 34.20161 |
7 | 7 | fakeName8 | 2 | 15 | 1 | 108.8684 | 34.20163 |
8 | 8 | fakeName9 | 2 | 15 | 1 | 108.8694 | 34.20163 |
9 | 9 | fakeName10 | 2 | 25 | 2 | 108.8694 | 34.20171 |
10 | 10 | fakeName11 | 2 | 25 | 2 | 108.8693 | 34.20171 |
11 | 11 | fakeName12 | 2 | 25 | 2 | 108.8684 | 34.2017 |
12 | 12 | fakeName13 | 2 | 25 | 2 | 108.8636 | 34.20168 |
13 | 13 | fakeName14 | 2 | 25 | 2 | 108.8594 | 34.2017 |
14 | 14 | fakeName15 | 2 | 25 | 2 | 108.857 | 34.20171 |
15 | 15 | fakeName16 | 2 | 25 | 2 | 108.8569 | 34.2017 |
16 | 16 | fakeName17 | 2 | 25 | 3 | 108.8773 | 34.21166 |
17 | 17 | fakeName18 | 2 | 25 | 3 | 108.8773 | 34.21174 |
18 | 18 | fakeName19 | 2 | 25 | 3 | 108.8773 | 34.21214 |
19 | 19 | fakeName20 | 2 | 25 | 3 | 108.8773 | 34.21411 |
20 | 20 | fakeName21 | 2 | 25 | 3 | 108.8772 | 34.21653 |
21 | 21 | fakeName22 | 2 | 25 | 3 | 108.8772 | 34.22002 |
22 | 22 | fakeName23 | 2 | 25 | 3 | 108.8773 | 34.2201 |
23 | 23 | fakeName24 | 2 | 25 | 4 | 108.8951 | 34.23529 |
24 | 24 | fakeName25 | 2 | 25 | 4 | 108.8951 | 34.23503 |
25 | 25 | fakeName26 | 2 | 25 | 4 | 108.8951 | 34.23268 |
26 | 26 | fakeName27 | 2 | 25 | 4 | 108.8951 | 34.23246 |
27 | 27 | fakeName28 | 2 | 25 | 4 | 108.8951 | 34.23138 |
28 | 28 | fakeName29 | 2 | 25 | 4 | 108.895 | 34.22857 |
29 | 29 | fakeName30 | 2 | 25 | 4 | 108.895 | 34.22846 |
Each row represents one point, and the last two columns of each row hold its coordinates (Lng and Lat). Two other columns need attention: FID is the auto-increment ID of the record, and Num is a number shared by the points that belong to the same line. The point-to-line conversion that follows is based on these two columns.
Plotting the points with GeoPandas shows their geographic distribution, as in the figure below:
xy = [Point(xy) for xy in zip(df.Lng,df.Lat)]
pointDataFrame = gpd.GeoDataFrame(df,geometry=xy)
pointDataFrame.plot(figsize = (24, 24))
<matplotlib.axes._subplots.AxesSubplot at 0x1f5eeb3b390>
Anyone familiar with GIS vector data structures knows that a line is made up of points: physically, it is an ordered sequence of points. The point-to-line conversion relies on this property.
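To see why the order of the points matters, here is a minimal sketch in plain Python. It takes four points from the table above and compares the path length when they are connected in file order and when two of them are swapped. The helper name `path_length` is made up for this illustration, and the length is a simple planar distance in degrees, not a real-world distance.

```python
import math

# Four consecutive points of one road, copied from the table above (Lng, Lat).
ordered = [
    (108.8569, 34.2016),
    (108.857, 34.2016),
    (108.8594, 34.20164),
    (108.8619, 34.20163),
]
# The same points with the two middle vertices swapped.
scrambled = [ordered[0], ordered[2], ordered[1], ordered[3]]


def path_length(points):
    """Sum of straight-line distances between consecutive vertices (in degrees)."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


print(path_length(ordered))
print(path_length(scrambled))
```

The scrambled version doubles back on itself, so its path is longer even though the set of points is identical. This is why the points of a line must stay in their original order.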
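Nothing can be accomplished without rules, and connecting points into a line is no exception. Judging from the table above, we convert points to lines according to the following rules: points with the same Num belong to the same line, and within a line the points are connected in FID order, which is the order in which they appear in the file.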
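As read from the table and from the grouping code that follows, the rule is that points sharing a Num form one line and are connected in FID order. A minimal plain-Python sketch of that rule, without pandas, might look like this. The function name `group_points` is hypothetical, the sample rows are copied from the table and shuffled on purpose, and skipping groups with fewer than two points is an assumption added here because a line needs at least two vertices.

```python
from itertools import groupby
from operator import itemgetter

# A few (FID, Num, Lng, Lat) rows copied from the table above, deliberately out of order.
rows = [
    (3, 1, 108.857, 34.2016),
    (0, 0, 108.8881, 34.21166),
    (2, 1, 108.8569, 34.2016),
    (1, 0, 108.8897, 34.21167),
    (4, 1, 108.8594, 34.20164),
]


def group_points(rows):
    """Return {Num: [(Lng, Lat), ...]} with each list in ascending FID order."""
    # Sort by Num first, then FID, so groupby sees each Num as one contiguous run.
    ordered = sorted(rows, key=itemgetter(1, 0))
    lines = {}
    for num, members in groupby(ordered, key=itemgetter(1)):
        coords = [(lng, lat) for _fid, _num, lng, lat in members]
        # Assumption: a line needs at least two vertices, so lone points are skipped.
        if len(coords) >= 2:
            lines[num] = coords
    return lines


lines = group_points(rows)
for num, coords in lines.items():
    print(num, coords)
```

Sorting explicitly makes the result independent of the row order in the file. The pandas version in this article relies on the file already being in FID order.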
The conversion process is as follows:
# Group the points by Num
dataGroup = df.groupby('Num')
# Build the data
tb = []
geomList = []
for name, group in dataGroup:
    # Separate the attribute information, and take the first 5 columns of the first row of each group as the data attributes
    tb.append(group.iloc[0, :5])
    # Pack the same group of points into a list
    xyList = [xy for xy in zip(group.Lng, group.Lat)]
    line = LineString(xyList)
    geomList.append(line)
# Point to line
geoDataFrame = gpd.GeoDataFrame(tb, geometry=geomList)
Let's print the result:
geoDataFrame.iloc[:20,:]
| FID | Name | State | Speed | Num | geometry |
0 | 0 | fakeName1 | 3 | 10 | 0 | LINESTRING (108.888107 34.21166229999999, 108.... |
2 | 2 | fakeName2 | 2 | 15 | 1 | LINESTRING (108.856926 34.2015953, 108.857033 ... |
9 | 9 | fakeName3 | 2 | 25 | 2 | LINESTRING (108.869408 34.2017097, 108.869308 ... |
16 | 16 | fakeName4 | 2 | 25 | 3 | LINESTRING (108.877281 34.2116547, 108.877281 ... |
23 | 23 | fakeName5 | 2 | 25 | 4 | LINESTRING (108.895088 34.2352905, 108.895073 ... |
31 | 31 | fakeName6 | 2 | 20 | 5 | LINESTRING (108.900749 34.2257996, 108.900589 ... |
38 | 38 | fakeName7 | 1 | 40 | 6 | LINESTRING (108.781281 33.8055725, 108.781242 ... |
2914 | 2914 | fakeName8 | 1 | 45 | 7 | LINESTRING (108.8955 34.19851679999999, 108.89... |
3131 | 3131 | fakeName9 | 1 | 35 | 8 | LINESTRING (108.895638 34.1985016, 108.89666 3... |
3174 | 3174 | fakeName10 | 1 | 30 | 9 | LINESTRING (108.875626 34.1899567, 108.87558 3... |
3180 | 3180 | fakeName11 | 1 | 30 | 10 | LINESTRING (108.875481 34.19846339999999, 108.... |
3187 | 3187 | fakeName12 | 1 | 35 | 11 | LINESTRING (108.881477 34.1797485, 108.881508 ... |
3229 | 3229 | fakeName13 | 1 | 35 | 12 | LINESTRING (108.890282 34.19849779999999, 108.... |
3267 | 3267 | fakeName14 | 1 | 35 | 13 | LINESTRING (108.874435 34.2050018, 108.874565 ... |
3339 | 3339 | fakeName15 | 1 | 35 | 14 | LINESTRING (108.945831 34.19813920000001, 108.... |
3408 | 3408 | fakeName16 | 1 | 35 | 15 | LINESTRING (108.855911 34.1825256, 108.855934 ... |
3430 | 3430 | fakeName17 | 1 | 40 | 16 | LINESTRING (108.857033 34.2045937, 108.856934 ... |
3451 | 3451 | fakeName18 | 1 | 35 | 17 | LINESTRING (108.841652 34.1938324, 108.841728 ... |
3482 | 3482 | fakeName19 | 1 | 40 | 18 | LINESTRING (108.864204 34.1888275, 108.86409 3... |
3497 | 3497 | fakeName20 | 1 | 35 | 19 | LINESTRING (108.863541 34.20478060000001, 108.... |
As you can see, the points have been grouped by Num, and each group has been merged into a LineString object. The result can then be saved as GeoJSON (two methods) or as a shapefile:
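# Method 1: let GeoPandas write the GeoJSON file directly
fp = r"E:\Dev\data\lineRoads.geojson"
geoDataFrame.to_file(fp, driver='GeoJSON', encoding="utf-8")
# Method 2: serialize to a GeoJSON string and write it to the file yourself
geojson = geoDataFrame.to_json()
with open(fp, 'w', encoding='utf-8') as f:
    f.write(geojson)
# Export to an ESRI Shapefile
shp = r"E:\Dev\data\lineRoads.shp"
geoDataFrame.to_file(shp, driver="ESRI Shapefile", encoding="utf-8")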
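For orientation, a GeoJSON file holding lines is a FeatureCollection whose features each carry a LineString geometry and a set of properties. The sketch below builds one such feature by hand with the standard `json` module. It is an illustration only, not the exact output of `to_file`: the attribute values and the two vertices are copied from the first table (rounded as shown there) for the road with Num 0.

```python
import json

# One road as a GeoJSON Feature. Attribute values and vertices are copied
# from the first table above (Num 0, FID 0 and 1), rounded as shown there.
feature = {
    "type": "Feature",
    "properties": {"FID": 0, "Name": "fakeName1", "State": 3, "Speed": 10, "Num": 0},
    "geometry": {
        "type": "LineString",
        # GeoJSON positions are [longitude, latitude], in that order.
        "coordinates": [[108.8881, 34.21166], [108.8897, 34.21167]],
    },
}
collection = {"type": "FeatureCollection", "features": [feature]}

text = json.dumps(collection, indent=2)
print(text)

# Reading the text back gives the same structure.
parsed = json.loads(text)
first = parsed["features"][0]
print(first["geometry"]["type"], len(first["geometry"]["coordinates"]))
```

Note the coordinate order: longitude comes first, which matches the `zip(group.Lng, group.Lat)` order used when the lines are built.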
Leaving out the test and verification code above, the whole point-to-line conversion takes only a few lines. Nice! The complete script:
import pandas as pd
import geopandas as gpd
from shapely.geometry import LineString, Point

def main():
    fp = r'E:\Dev\data\BigRoads31.csv'
    df = pd.read_csv(fp)
    # Group the points by Num
    dataGroup = df.groupby('Num')
    # Build the data
    tb = []
    geomList = []
    for name, group in dataGroup:
        # Separate the attribute information, and take the first 5 columns of the first row of each group as the data attributes
        tb.append(group.iloc[0, :5])
        # Pack the same group of points into a list
        xyList = [xy for xy in zip(group.Lng, group.Lat)]
        line = LineString(xyList)
        geomList.append(line)
    # Point to line
    geoDataFrame = gpd.GeoDataFrame(tb, geometry=geomList)
    fp = r"E:\Dev\data\lineRoads.geojson"
    geoDataFrame.to_file(fp, driver='GeoJSON', encoding="utf-8")

if __name__ == '__main__':
    main()