dtoolkit.transformer.GeoKMeans
- class dtoolkit.transformer.GeoKMeans(n_clusters=8, *, init='k-means++', n_init='auto', max_iter=300, tol=0.0001, verbose=0, random_state=None, copy_x=True, algorithm='elkan')
Spatial K-Means clustering.
The distance is calculated with the haversine formula. Parameters and attributes are the same as
those of sklearn.cluster.KMeans.
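For readers unfamiliar with the haversine formula, the following is a minimal sketch of the great-circle distance between two points given as (longitude, latitude) in degrees. The function name `haversine`, the constant `EARTH_RADIUS_KM`, and the choice of kilometres are assumptions made for illustration; this is not GeoKMeans's internal code, which may differ in units and vectorisation.

```python
import math

# Assumed mean Earth radius in kilometres (illustrative constant).
EARTH_RADIUS_KM = 6371.0088


def haversine(lon1, lat1, lon2, lat2, radius=EARTH_RADIUS_KM):
    """Great-circle distance between two (longitude, latitude) points.

    Inputs are in degrees; the result is in the unit of ``radius``.
    """
    # Convert degrees to radians before applying the trigonometric formula.
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    # Haversine of the central angle between the two points.
    a = (
        math.sin(dlat / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    )
    # Central angle times the sphere radius gives the arc length.
    return 2 * radius * math.asin(math.sqrt(a))
```

Called as `haversine(113.615822, 37.844797, 113.586288, 37.917018)`, it returns the distance in kilometres between the first two points of the example data.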
- Raises:
- ValueError
If the input is not in the form of [(longitude, latitude)].
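To make the expected input form concrete, here is a hypothetical pre-check a caller could run before fitting. The helper `check_lonlat` and its range rules (longitude in [-180, 180], latitude in [-90, 90]) are assumptions for illustration; the checks GeoKMeans actually performs before raising ValueError are not specified here and may differ.

```python
def check_lonlat(X):
    """Hypothetical validator: every row must be a (longitude, latitude) pair."""
    for i, point in enumerate(X):
        # Each row must contain exactly two coordinates.
        if len(point) != 2:
            raise ValueError(f"row {i} is not a (longitude, latitude) pair")
        lon, lat = point
        # Assumed valid ranges for geographic coordinates in degrees.
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            raise ValueError(f"row {i} is outside the valid longitude/latitude range")
```

A common mistake this would catch is passing coordinates as (latitude, longitude): `[37.844797, 113.615822]` fails because 113.615822 is not a valid latitude.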
See also
sklearn.cluster.KMeans
Original implementation of K-Means clustering.
Notes
algorithm is fixed to "elkan", because only the elkan algorithm can support a custom distance.
Examples
>>> from dtoolkit.transformer import GeoKMeans
>>> X = [
...     [113.615822, 37.844797],
...     [113.586288, 37.917018],
...     [113.630711, 37.865369],
...     [113.590684, 37.948056],
...     [113.631483, 37.862634],
...     [113.57413, 37.968669],
...     [113.663159, 37.848446],
...     [113.586941, 37.868116],
...     [113.679381, 37.875028],
...     [113.5706, 37.973542],
...     [113.585504, 37.879261],
...     [113.584412, 37.935521],
...     [113.575964, 37.906472],
...     [113.593658, 37.848911],
...     [113.633605, 37.869107],
...     [113.582298, 37.857025],
...     [113.629378, 37.805196],
...     [113.48768, 37.872603],
...     [113.477766, 37.868846],
... ]
>>> geokmeans = GeoKMeans(n_clusters=2, random_state=0).fit(X)
>>> geokmeans.labels_
array([0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0], dtype=int32)
>>> geokmeans.cluster_centers_
array([[113.59979892,  37.85887223],
       [113.58034633,  37.94154633]])