dtoolkit.accessor.dataframe.decompose
- dtoolkit.accessor.dataframe.decompose(df: pd.DataFrame, /, method: TransformerMixin, columns: None | dict[Hashable | tuple[Hashable], Hashable | list[Hashable] | tuple[Hashable]] | list[Hashable] | pd.Index = None, drop: bool = False, **kwargs) -> pd.DataFrame
Decompose DataFrame’s columns.
- Parameters:
- method : TransformerMixin
Decomposition transformer.
- columns : dict, list, Index or None, default None
Choose columns to decompose.
None : Decompose all columns.
list or Index : Decompose the selected columns.
dict : Decompose and remap columns to a few, {new columns: old columns}.
- drop : bool, default False
If True, drop the used columns when columns is a dict.
- **kwargs
See the documentation for method for complete details on the keyword arguments.
- Returns:
- DataFrame
- Raises:
- ValueError
If the number of rows is less than the number of columns.
See also
sklearn.decomposition
Scikit-learn’s matrix decomposition transformer.
Examples
>>> import dtoolkit
>>> import pandas as pd
>>> from sklearn import decomposition
>>> df = pd.DataFrame(
...     [
...         [-1, -1, 1, 1],
...         [-2, -1, 2, 1],
...         [-3, -2, 3, 2],
...         [1, 1, -1, -1],
...         [2, 1, -2, -1],
...         [3, 2, -3, -2],
...     ],
...     columns=["a", "b", "c", "d"],
... )
>>> df
   a  b  c  d
0 -1 -1  1  1
1 -2 -1  2  1
2 -3 -2  3  2
3  1  1 -1 -1
4  2  1 -2 -1
5  3  2 -3 -2
Decompose all columns.
>>> df.decompose(decomposition.PCA)
          a         b             c             d
0  1.956431  0.415183  9.009015e-17  8.100537e-18
1  3.142238 -0.355441  8.394617e-17  9.817066e-18
2  5.098670  0.059742 -8.445140e-17  1.640353e-19
3 -1.956431 -0.415183 -7.881266e-17  8.428608e-18
4 -3.142238  0.355441 -8.495664e-17  1.014514e-17
5 -5.098670 -0.059742  8.445140e-17 -1.640353e-19
Decompose the selected columns.
>>> df.decompose(decomposition.PCA, ["a", "b"])
          a         b  c  d
0  1.383406  0.293579  1  1
1  2.221898 -0.251335  2  1
2  3.605304  0.042244  3  2
3 -1.383406 -0.293579 -1 -1
4 -2.221898  0.251335 -2 -1
5 -3.605304 -0.042244 -3 -2
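For comparison, the selected-columns call above can be approximated with plain scikit-learn and pandas. This is only a sketch of the apparent behaviour, not dtoolkit's actual implementation: it assumes the transformer is fitted on the selected columns alone and that the components are written back under the original labels. The names `selected`, `components` and `result` are illustrative.

```python
import pandas as pd
from sklearn.decomposition import PCA

df = pd.DataFrame(
    [
        [-1, -1, 1, 1],
        [-2, -1, 2, 1],
        [-3, -2, 3, 2],
        [1, 1, -1, -1],
        [2, 1, -2, -1],
        [3, 2, -3, -2],
    ],
    columns=["a", "b", "c", "d"],
)

selected = ["a", "b"]

# Fit PCA on the selected columns only. With n_components left unset,
# PCA keeps min(n_rows, n_columns) components, which is 2 here.
components = PCA().fit_transform(df[selected])

# Replace "a" and "b" with the components (one component per column),
# leaving "c" and "d" untouched. assign() returns a new DataFrame.
result = df.assign(**dict(zip(selected, components.T)))
print(result)
```

The sign of each principal component is not guaranteed to be identical across scikit-learn versions, so the values may differ from the output above by a factor of -1 per column.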
Decompose and remap columns to a few.
>>> df.decompose(
...     decomposition.PCA,
...     {"A": ["a", "b"], "B": ["b", "c", "d"]},
... )
          A         B  a  b  c  d
0  1.383406  1.694316 -1 -1  1  1
1  2.221898  2.428593 -2 -1  2  1
2  3.605304  4.122909 -3 -2  3  2
3 -1.383406 -1.694316  1  1 -1 -1
4 -2.221898 -2.428593  2  1 -2 -1
5 -3.605304 -4.122909  3  2 -3 -2
>>> df.decompose(
...     decomposition.PCA,
...     {("A", "B"): ["a", "b", "c"]}
... )
          A         B  a  b  c  d
0  1.702037  0.321045 -1 -1  1  1
1  2.988071 -0.267273 -2 -1  2  1
2  4.690108  0.053773 -3 -2  3  2
3 -1.702037 -0.321045  1  1 -1 -1
4 -2.988071  0.267273  2  1 -2 -1
5 -4.690108 -0.053773  3  2 -3 -2
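The dict form can be sketched in plain scikit-learn as well. The helper below, `decompose_remap`, is hypothetical and is not part of dtoolkit. It assumes, judging only from the example output above, that each group of old columns is reduced to as many components as there are new column names, that the new columns are placed before the original ones, and that `drop=True` removes every column used in the mapping.

```python
import pandas as pd
from sklearn.decomposition import PCA


def decompose_remap(df, mapping, drop=False):
    """Hypothetical helper: reduce each group of old columns to new ones."""
    new_columns = {}
    used = []
    for new, old in mapping.items():
        # A tuple key names several new columns; any other key names one.
        names = list(new) if isinstance(new, tuple) else [new]
        old = list(old)
        # Assumption: one principal component per new column name.
        values = PCA(n_components=len(names)).fit_transform(df[old])
        new_columns.update(zip(names, values.T))
        used.extend(old)

    # Optionally drop every column that fed a decomposition (deduplicated).
    base = df.drop(columns=list(dict.fromkeys(used))) if drop else df
    return pd.concat([pd.DataFrame(new_columns, index=df.index), base], axis=1)


df = pd.DataFrame(
    [
        [-1, -1, 1, 1],
        [-2, -1, 2, 1],
        [-3, -2, 3, 2],
        [1, 1, -1, -1],
        [2, 1, -2, -1],
        [3, 2, -3, -2],
    ],
    columns=["a", "b", "c", "d"],
)

remapped = decompose_remap(df, {"A": ["a", "b"], "B": ["b", "c", "d"]})
paired = decompose_remap(df, {("A", "B"): ["a", "b", "c"]})
only_new = decompose_remap(df, {"A": ["a", "b"], "B": ["b", "c", "d"]}, drop=True)
print(remapped)
```

As with any PCA output, component signs may be flipped relative to the documented values depending on the scikit-learn version.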