dtoolkit.accessor.series.textdistance
- dtoolkit.accessor.series.textdistance(s: Series, /, other: str | Series, method: Callable = None, align: bool = True, **kwargs) -> Series
Return a Series containing the text distance to aligned other.
- Parameters:
- other: None, str or Series
- align: bool, default True
If True, automatically align s and other based on their indices. If False, the order of elements is preserved.
- method: Callable, default None
The method used to calculate the distance. The two strings being compared are passed as its first and second positional parameters. If None, rapidfuzz.fuzz.ratio() is used. The methods in rapidfuzz.fuzz and rapidfuzz.distance are recommended.
- **kwargs
Additional keyword arguments passed to method.
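The contract for method described above (two strings as the first and second positional parameters, extra keyword arguments forwarded) can be sketched with the standard library alone. The function name similarity and its ignore_case option are hypothetical, and difflib is only a stand-in for rapidfuzz, so the scores are not guaranteed to match rapidfuzz.fuzz.ratio().

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str, *, ignore_case: bool = False) -> float:
    """Hypothetical custom method: score two strings from 0 to 100."""
    if ignore_case:
        a, b = a.lower(), b.lower()
    # difflib is a standard-library stand-in here, not rapidfuzz.
    return 100 * SequenceMatcher(None, a, b).ratio()


# The pairwise calls such a method would receive, written out by hand.
left = ["Hello", "world"]
right = ["hello", "python"]
scores = [similarity(a, b, ignore_case=True) for a, b in zip(left, right)]
print(scores)
```

Going by the parameter descriptions, a callable of this shape would presumably be passed as s.textdistance(other, method=similarity, ignore_case=True), with ignore_case reaching similarity through **kwargs.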
- Returns:
- Series
The values are the text distances.
- Raises:
- ModuleNotFoundError
If the module named ‘rapidfuzz’ is not installed.
- TypeError
If s is not string dtype.
If other is not string dtype.
- ValueError
If the length of other is not equal to the length of s.
See also
rapidfuzz.fuzz
rapidfuzz.distance
textdistance_matrix
Notes
The result of comparing with a None or NaN value depends on the method.
Examples
>>> import dtoolkit
>>> import pandas as pd
>>> s = pd.Series(["hello", "world"])
>>> s
0    hello
1    world
dtype: object
>>> s.textdistance("python")
0    36.363636
1    18.181818
dtype: float64
>>> s.textdistance(pd.Series(["hello", "python"]))
0    100.000000
1     18.181818
dtype: float64
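The scores in the examples can be reproduced by hand. The sketch below assumes that the default method, rapidfuzz.fuzz.ratio(), is the normalized Indel similarity scaled to 0-100 (this is how rapidfuzz describes it, not something this page states). The helper names lcs_length and indel_ratio are hypothetical and not part of dtoolkit or rapidfuzz.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def indel_ratio(a: str, b: str) -> float:
    """Normalized Indel similarity on a 0-100 scale."""
    total = len(a) + len(b)
    if total == 0:
        # Assumption of this sketch: two empty strings count as identical.
        return 100.0
    # Insertions plus deletions needed to turn a into b.
    indel_distance = total - 2 * lcs_length(a, b)
    return 100 * (1 - indel_distance / total)


for left, right in [("hello", "python"), ("world", "python"), ("hello", "hello")]:
    print(left, right, round(indel_ratio(left, right), 6))
```

For "hello" and "python" the longest common subsequence is "ho", so the Indel distance is 5 + 6 - 2 * 2 = 7 and the score is 100 * (1 - 7 / 11), which matches the 36.363636 shown in the example.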