Changelog

Version 0.0.20 (2022-12-30)

Highlights of this release:

Strong support for H3 (Hexagonal hierarchical geospatial indexing system) via .to_h3 and .H3.*.

>>> import dtoolkit.geoaccessor
>>> import pandas as pd
>>> df = pd.DataFrame({"x": [122, 100], "y": [55, 1]}).from_xy('x', 'y', crs=4326)
>>> df
     x   y                    geometry
0  122  55  POINT (122.00000 55.00000)
1  100   1   POINT (100.00000 1.00000)

# GeoDataFrame -> h3 cell

>>> df_with_h3 = df.to_h3(8)
>>> df_with_h3
                      x   y                    geometry
612845052823076863  122  55  POINT (122.00000 55.00000)
614269156845420543  100   1   POINT (100.00000 1.00000)

# Calculate h3 cell area

>>> df_with_h3.h3.area
612845052823076863    710781.770906
614269156845420543    852134.191671
dtype: float64

# h3 cell -> GeoDataFrame

>>> df_parent_cell = df_with_h3.h3.to_parent()
>>> df_parent_cell
                      x   y                    geometry
608341453197803519  122  55  POINT (122.00000 55.00000)
609765557230632959  100   1   POINT (100.00000 1.00000)
>>> df_parent_cell.h3.to_points()
                      x   y                    geometry
608341453197803519  122  55  POINT (122.00991 55.00606)
609765557230632959  100   1   POINT (100.00504 0.99852)

New features and improvements:

Small bug-fix:

API changes:

Documentation:

  • PR#802: Reorder methods by function first, then by name.

  • PR#808: Mark Series dtype.

Maintenance development:

  • PR#774: pre-commit hooks autoupdate.

  • PR#798: Remove pygeos dependency from dtoolkit.

  • PR#805: Remove ci/env/311-latest-shapely2.yaml.

  • PR#806: Compatibility with pandas 2.x.

  • PR#810: Remove dtoolkit.accessor.series._getattr_helper.py.

  • PR#812: Add blank lines.

  • PR#813: Remove 0.0.19 version warning information.

  • PR#818: Simplify shapely object imports: from shapely.geometry import xxx -> from shapely import xxx.

Version 0.0.19 (2022-12-11)

Highlights of this release:

New features and improvements:

Small bug-fix:

  • PR#576: Fix DataFrame.append’s FutureWarning.

  • PR#765: Fix sklearn pipeline visualization failing to print OneHotEncoder.

  • PR#776: Fix the GitHub release page no longer providing the tarball file after v0.0.17.

API changes:

  • PR#762: Drop the columns argument of error_report.

Documentation:

  • PR#755: Update installation documentation.

  • PR#766: Some patches to documentation.

Maintenance development:

Version 0.0.18 (2022-10-14)

New features and improvements:

API changes:

Small bug-fix:

Documentation:

Maintenance development:

Version 0.0.17 (2022-8-15)

Highlights of this release:

New features and improvements:

Small bug-fix:

  • Avoid the GeoDataFrame constructor mutating the original (input) DataFrame (PR#644).

  • Avoid fillna_regression() mutating the original dataframe (PR#622).

  • Compatibility with sklearn 1.2’s stricter class parameter checking (PR#602).

  • geobuffer() uses the active geometry to generate buffers (PR#583).

  • Hook accessor method’s attrs into both class and instance (PR#580).

API changes:

  • Add deprecation warning for utm_crs() (PR#637, PR#645).

  • Remove warning message and drop inplace option (PR#555).

  • Use positional-only arguments (/) to restrict passing arguments by name (PR#435).
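
For readers unfamiliar with the positional-only marker mentioned above, here is a minimal sketch of what it does. The function top_n_demo and its parameters are hypothetical names made up for this illustration; they are not dtoolkit's actual signatures.

```python
def top_n_demo(values, /, n=1):
    """Return the n largest items; `values` is positional-only (hypothetical helper)."""
    return sorted(values, reverse=True)[:n]


# Passing the first argument by position works as usual.
largest = top_n_demo([3, 1, 2], n=2)

# Passing it by keyword raises TypeError, so the parameter name
# can be changed later without breaking callers.
try:
    top_n_demo(values=[3, 1, 2])
except TypeError:
    keyword_rejected = True
else:
    keyword_rejected = False
```

Parameters after the slash, such as n here, can still be passed by keyword.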

Documentation:

  • Add a Raises section to the documentation (PR#623).

  • Apply singular file name style to /doc/* (PR#613).

  • Remove title ‘.dev0’ and ‘.post0’ suffixes (PR#587).

  • Beautify the formatting of input dictionaries (PR#577).

Maintenance development:

  • Set timeout for updating versioneer CI (PR#657).

  • drop_inf/get_inf_range returns set instead of list (PR#656).

  • Remove ‘fkirc/skip-duplicate-actions’ (PR#655).

  • Rename arguments of methods (PR#647).

  • Remove ‘geopy’ from *-minimal.yaml env (PR#621).

  • Use cut as bin()’s alias (PR#619).

  • Use topn as top_n()’s alias (PR#617).

  • Follow Series.nlargest(n=5, keep='first') API (PR#616).

  • Follow numpy.repeat(repeats, axis) API (PR#615).

  • Set only positional parameter (/) for (geo)accessor (PR#612).

  • Add environment.yaml at root path for user (PR#611).

  • Use pandas.testing.assert_*_equal instead of (Series|DataFrame).equals in tests (PR#607, PR#608).

  • Use function style rather than OOP (PR#606, PR#633, PR#648, PR#653).

  • Singular style file name (PR#605).

  • Correct file name (PR#604).

  • Rename yaml file *.yml -> *.yaml (PR#603).

  • (Geo)DataFrame geoaccessors no longer return (Geo)Series (PR#601).

  • Set default coding style via EditorConfig (PR#600).

  • Adapt to the changes in actions/setup-python@v4 (PR#581).

  • pre-commit hooks autoupdate (PR#579, PR#595, PR#610, PR#627, PR#634, PR#639).

  • Autoupdate actions (PR#578, PR#592, PR#628).

  • Move dtoolkit.transformer.pipeline into dtoolkit.pipeline (PR#563).

Typing annotations:

  • Use Hashable instead of str | int (PR#582).

  • Use Literal (PR#505).
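
The two typing changes above can be sketched as follows. This is an illustration only: pick_column, its parameters, and the sample table are hypothetical and do not come from dtoolkit. Hashable admits any valid label (str, int, tuple, ...) rather than just str | int, and Literal restricts an option to a fixed set of values for type checkers.

```python
from collections.abc import Hashable
from typing import Literal


def pick_column(
    table: dict,
    name: Hashable,
    keep: Literal["first", "last"] = "first",
):
    """Return the first or last value stored under the label `name` (hypothetical helper)."""
    values = table[name]
    return values[0] if keep == "first" else values[-1]


# Labels may be strings, integers, or tuples: all of them are hashable.
table = {"x": [122, 100], 0: [55, 1], ("a", "b"): [7, 8]}
first_x = pick_column(table, "x")
last_zero = pick_column(table, 0, keep="last")
last_tuple = pick_column(table, ("a", "b"), keep="last")
```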

Version 0.0.16 (2022-5-30)

New features and improvements:

API changes:

Documentation:

  • Adjust the sequence of methods (PR#565).

  • Index the _decorator and _exception methods (PR#532).

Maintenance development:

Version 0.0.15 (2022-5-13)

New features and improvements:

API changes:

  • Add version information for warning (PR#528).

  • Add DeprecationWarning for dropping axis option of filter_in() (PR#522).

  • Add DeprecationWarning for dropping generic package (PR#521).

  • Add DeprecationWarning for dropping inplace option of filter_in() (PR#519).

  • Drop unique_counts() method (PR#512).
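
The deprecation pattern named in the entries above typically looks like the sketch below. It is not dtoolkit's implementation: filter_in_demo, its parameters, and the warning message are assumptions made for illustration, using only the standard warnings module.

```python
import warnings


def filter_in_demo(values, allowed, inplace=None):
    """Keep only items found in `allowed`; `inplace` is deprecated and ignored (hypothetical helper)."""
    if inplace is not None:
        warnings.warn(
            "The 'inplace' option is deprecated and will be removed "
            "in a future version.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
    return [v for v in values if v in allowed]


# Record warnings so the deprecation can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = filter_in_demo([1, 2, 3], {1, 3}, inplace=False)

warned = [w.category for w in caught]
```

Callers who never pass the deprecated option see no warning, which gives them a release cycle to migrate before the option is dropped.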

Maintenance development:

  • Add single quotes via !r in f-strings (PR#520).

  • Add a changelog link to the PyPI page (PR#517).

  • Use build (the new distutils) and add pyproject.toml configuration (PR#516).

  • pre-commit hooks autoupdate (PR#515, PR#524).

  • Remove warning message (PR#513, PR#514, PR#526).
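
As a short illustration of the !r conversion mentioned in PR#520 above: it applies repr() to the value, which wraps a plain string in single quotes. The variable names and message text below are made up for the example.

```python
name = "inplace"

# Without a conversion, the value is inserted via str().
plain = f"{name} is deprecated"    # inplace is deprecated

# With !r, the value is inserted via repr(), adding the quotes.
quoted = f"{name!r} is deprecated"  # 'inplace' is deprecated
```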

Version 0.0.14 (2022-5-1)

New features and improvements:

API changes:

Documentation:

  • Add top_n()’s new example about returning values (PR#489).

  • Adjust API reference sequences (PR#478).

Maintenance development:

Version 0.0.13 (2022-4-2)

New features and improvements:

API changes:

  • Array in array out (PR#460).

  • OneHotEncoder’s fit_transform uses the input’s index (PR#458).

  • Let Pipeline’s fit_transform support Series (PR#457).

  • Drop dtoolkit.transformer.MinMaxScaler (PR#451).

Small bug-fix:

  • Fix Jupyter notebooks failing to render (PR#438).

Documentation:

  • Rename sphinx project name from ‘dtoolkit’ to ‘my data toolkit’ (PR#454).

  • Add a ‘feature’ section to the documentation homepage (PR#452).

Maintenance development:

Typing annotations:

  • Specify None type usage (PR#467).

  • Specify None type (PR#466).

  • Add Number and IntOrStr annotation constants (PR#465).

Version 0.0.12 (2022-2-11)

Highlights of this release:

  • Specify the minimum pandas version for each Python version (PR#440).

  • Single-column data pipelines can return a Series (PR#431).

API changes:

  • Add DeprecationWarning for dtoolkit.transformer.MinMaxScaler (PR#449).

Documentation:

Maintenance development:

Version 0.0.11 (2022-1-25)

New features and improvements:

  • Simplify OneHotEncoder examples and inputs (PR#434).

  • FeatureUnion merges all outputs into one DataFrame, and the index uses the common part (PR#433).

Small bug-fix:

  • Fix Jupyter notebooks failing to render (PR#438).

Maintenance development:

  • Simplify linting workflow (PR#437).

Version 0.0.10 (2022-1-21)

Highlights of this release:

New features and improvements:

  • Add number and other options to lens() (PR#406).

Documentation:

Maintenance development:

  • Cancel any previous runs that are not completed (PR#426).

  • Add skip check job (PR#425).

  • Use mamba to speed up building env (PR#422, PR#427, PR#436).

  • Test register_*_method positional arguments (PR#420).

  • Simplify CI jobs (PR#416, PR#423, PR#424).

  • Add some new pre-commit hooks (PR#407).

  • Tags containing ‘rc’ are treated as ‘pre-release’ (PR#404).

  • Rename ci/envs/* to ci/env/* (PR#403).

  • Add a skip check to avoid frequently creating versioneer autoupdate PRs (PR#397).

Version 0.0.9 (2022-1-10)

Highlights of this release:

New features and improvements:

API changes:

  • Call lens() via Series.len (PR#394).

Maintenance development:

  • Draft the release via GitHub Actions, then add the changelog manually (PR#396).

  • Fix words, a -> an (PR#387).

  • Pre-commit hooks autoupdate (PR#384).

Contributing development:

  • Add pull request template (PR#361).

Documentation:

  • Correct sphinx method link (PR#390).

Version 0.0.8 (2022-1-1)

Highlights of this release:

  • Publish to PyPI (PR#363).

  • Change PyPI project name from dtoolkit to my-data-toolkit (PR#382).

API changes:

  • Remove geographic_buffer() (PR#348).

Maintenance development:

Documentation:

  • Correct package name, MinMaxScaler -> OneHotEncoder (PR#374).

  • Shorten package path, dtoolkit.accessor.register -> dtoolkit.accessor (PR#373).

New contributors:

Version 0.0.7 (2021-12-30)

Highlights of this release:

New features and improvements:

  • Extend lens() function range (PR#356).

  • New geoaccessor method utm_crs() (PR#346).

API changes:

  • Add DeprecationWarning for geographic_buffer() (PR#341).

Maintenance development:

Typing annotations:

Version 0.0.6 (2021-12-13)

Highlights of this release:

New features and improvements:

Bug fixes:

  • Fix the version number shown on the Sphinx home page (PR#318).

Maintenance development:

  • Publish to test.pypi.org only when event is ‘push’ (PR#337).

  • pre-commit autoupdate (PR#324).

  • Update commit message of bot (PR#321).

  • Add workflow to automatically update versioneer (PR#319, PR#333).

Documentation:

  • Documentation patch (PR#329).

Version 0.0.5 (2021-12-6)

Highlights of this release:

New features and improvements:

API changes:

  • Remove toolkit.geography (PR#277).

Maintenance development:

Documentation:

  • Redirect py-modindex.html to reference.html (PR#314).

  • Update Readme file (PR#313).

  • Add documentation for generating geographic buffer methods (PR#308).

  • Complete top_n()’s documentation (PR#305).

New contributors:

Version 0.0.4 (2021-11-8)

Highlights of this release:

New features and improvements:

API changes:

  • Add DeprecationWarning for toolkit.geography (PR#274).

  • Keep snake_case naming style, dropinf -> drop_inf and filterin -> filter_in (PR#249, PR#253).

Maintenance development:

  • Only publish .tar file (PR#246).

  • Use artifacts to store the sdist, fixing the problem that different CI jobs can’t exchange data (PR#242).

Documentation:

Version 0.0.3 (2021-10-21)

New features and improvements:

Documentation:

Maintenance development:

Version 0.0.2 (2021-9-2)

Highlights of this release:

  • DToolKit now supports Python 3.9 and works with Python >= 3.7 (PR#211).

New features and improvements:

  • Add transform_series_to_frame() to convert a Series to a DataFrame, so that the data flowing through the pipeline stays a DataFrame (PR#202).

  • Make a generic array to frame transform function (PR#193, PR#198).

  • Simplify base Transformer, move Transformer’s __init__ and fit to MethodTF (PR#192).

  • Let update_invargs() use the old arguments when the new ones are empty (PR#191).

API changes:

  • Move isin() to dtoolkit/accessor/_util.py (PR#200).

  • Drop istype (PR#189).

Bug fixes:

  • Fix incorrect typing that prevented the VS Code plugin from showing function documentation (PR#203, PR#205).

  • Fix the wrong homepage name shown by pip show dtoolkit (PR#201).

Typing annotations:

  • Add OneDimArray and TwoDimArray typing (PR#209).

  • Add GeoSeriesOrGeoFrame typing (PR#207).

  • Add SeriesOrFrame typing (PR#206).

  • Specify that make_union()’s input is a list of Transformer (PR#199).

  • Enrich transform()’s annotations (PR#197).

  • Fix multi_if_else()’s if_condition_return parameter annotation (PR#195).

  • Remove PandasType and GeoPandasType (PR#190).

  • Fix dtoolkit.transformer._util.isin()’s annotation (PR#188).

  • Let dtoolkit.transformer._util.isin()’s axis accept str type (PR#188).

Documentation:

  • dropinf’s inf can be any inf, not only np.inf (PR#197).

  • Update README.md contents (PR#185).

Maintenance development:

  • Use singular naming style for both scripts and folders (PR#210).

  • Use absolute path to import parent level folder script (PR#204).

  • Drop useless comments in test files; these comments are outdated (PR#187).

  • Simplify setup.py contents (PR#185).