dtoolkit.accessor.series.textdistance_matrix
- dtoolkit.accessor.series.textdistance_matrix(s: Series, /, other: None | Series = None, method: Callable = None, **kwargs) -> DataFrame
Returns a DataFrame containing the matrix of text distances between s and other.
- Parameters:
- other : None or Series, default None
If None, use s.
- method : Callable, default None
The method used to calculate the distance. Its first and second positional parameters receive the two values being compared. If None, rapidfuzz.fuzz.ratio() is used. Methods from rapidfuzz.fuzz and rapidfuzz.distance are recommended.
- **kwargs
Additional keyword arguments passed to method.
- Returns:
- DataFrame
The values are the text distances.
- Raises:
- ModuleNotFoundError
If the rapidfuzz module is not installed.
- TypeError
If s is not of string dtype.
If other is not of string dtype.
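To make the method and **kwargs parameters concrete, here is a minimal pure-Python sketch of a pairwise comparison like the one described above. It is not dtoolkit's implementation: distance_matrix, indel_ratio, lcs_length, and the scale keyword are hypothetical names introduced for illustration, and indel_ratio is only assumed to mirror the normalized Indel similarity that rapidfuzz.fuzz.ratio is documented to compute. Nested lists stand in for the returned DataFrame.

```python
from typing import Callable, List, Optional, Sequence


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    previous = [0] * (len(b) + 1)
    for char_a in a:
        current = [0]
        for j, char_b in enumerate(b, start=1):
            if char_a == char_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def indel_ratio(a: str, b: str, *, scale: float = 100.0) -> float:
    """Hypothetical stand-in scorer: scale * 2 * LCS / (len(a) + len(b))."""
    total = len(a) + len(b)
    if total == 0:
        return scale  # two empty strings are treated as identical
    return scale * 2 * lcs_length(a, b) / total


def distance_matrix(
    left: Sequence[str],
    right: Optional[Sequence[str]] = None,
    method: Callable[..., float] = indel_ratio,
    **kwargs,
) -> List[List[float]]:
    """Compare every item of left (rows) with every item of right (columns)."""
    right = left if right is None else right  # mirrors "If None, use s"
    # The two compared values go in as the first and second positional
    # arguments; any extra keyword arguments are forwarded to method.
    return [[method(x, y, **kwargs) for y in right] for x in left]


matrix = distance_matrix(["hello", "world"], ["hello", "python"])
for row in matrix:
    print(row)
```

Rows correspond to the items of the first sequence and columns to the items of the second. An extra keyword such as scale=1.0 is forwarded unchanged to the scorer, which is the role **kwargs plays in the signature above.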
See also
rapidfuzz.fuzz
rapidfuzz.distance
textdistance
Notes
The result of comparing with a None or NaN value depends on the method.
Examples
>>> import dtoolkit
>>> import pandas as pd
>>> s = pd.Series(["hello", "world"])
>>> s
0    hello
1    world
dtype: str
>>> s.textdistance_matrix(pd.Series(["hello", "python"]))
       0          1
0  100.0  36.363636
1   20.0  18.181818
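Because the handling of None and NaN is left to the method, one option is to make that handling explicit in the callable itself. The following is a hedged sketch, not part of dtoolkit or rapidfuzz: skip_missing, is_missing, and exact_match are hypothetical helpers, and exact_match is a toy scorer used only to keep the example self-contained.

```python
import math
from typing import Any, Callable


def is_missing(value: Any) -> bool:
    """True for None and for float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def skip_missing(
    method: Callable[..., float],
    fill: float = float("nan"),
) -> Callable[..., float]:
    """Wrap method so that any missing operand yields fill instead."""

    def wrapped(a: Any, b: Any, **kwargs: Any) -> float:
        if is_missing(a) or is_missing(b):
            return fill
        return method(a, b, **kwargs)

    return wrapped


def exact_match(a: str, b: str) -> float:
    """Toy scorer (an assumption for this sketch): 100.0 if equal, else 0.0."""
    return 100.0 if a == b else 0.0


safe = skip_missing(exact_match, fill=0.0)
print(safe("hello", "hello"))
print(safe("hello", None))
```

A wrapper of this kind could in principle be passed as method, since it takes the two compared values as its first and second positional parameters. Whether it interacts cleanly with missing values in a string-dtype Series has not been verified here.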