Tip
This page was generated from guide/tips_about_getattr.ipynb.
Tips About Accessing Element Attributes of Series#
Series combins data. When Series combines same type data, the Series will become a container to access data attributes. So this type of Series could be call a string type of Series, a Path type of Series, a CRS type of Series and more.
Are there any ways to access data attributes of Series?
Use
applymethod to fetch attributes.Use Pandas Object Accessor,
pandas.api.extensions.register_series_accessor.Use
getattr,dtoolkit.accessor.series.getattr.
Use apply Method#
apply Method’s String Example#
[1]:
import pandas as pd
s = pd.Series(["hello", "world"])
s
[1]:
0 hello
1 world
dtype: str
Count the 'l' number.
[2]:
s.apply(lambda x: x.count("l"))
[2]:
0 2
1 1
dtype: int64
Find the 'l'.
[3]:
s.apply(lambda x: x.find("l"))
[3]:
0 2
1 3
dtype: int64
Return the element length.
[4]:
s.apply(lambda x: len(x))
[4]:
0 5
1 5
dtype: int64
apply Method Conclusion#
From above example, we could see:
advantages
doesn’t need to pre-code firstly
support to return arbitrary result via
lambdafunction
disadvantages
bad code style, need to key in a lot of nonsense codes
Use Pandas Object Accessor#
For convenience, Pandas does a .str accessor to access string attributes for string type of Series.
.str Accessor’s String Example#
Count the 'l' number.
[5]:
s.str.count("l")
[5]:
0 2
1 1
dtype: int64
Find the 'l'.
[6]:
s.str.find("l")
[6]:
0 2
1 3
dtype: int64
Return the element length.
[7]:
s.str.len()
[7]:
0 5
1 5
dtype: int64
Accessor Method Conclusion#
From above example, we could see:
advantages
keep the same code style,
'a string'.count()&Series.str.count()support to add additional attributes,
'a string'.__len__()->Series.str.len()
disadvantages
need to pre-code firstly
Use getattr#
Let us quickly glance getattr example to show.
getattr’s String Example#
[8]:
import dtoolkit # noqa: F401
Count the 'l' number.
[9]:
s.getattr("count", "l")
[9]:
0 2
1 1
dtype: int64
Find the 'l'.
[10]:
s.getattr("find", "l")
[10]:
0 2
1 3
dtype: int64
Return the element length.
[11]:
s.getattr("__len__")
[11]:
0 5
1 5
dtype: int64
getattr Method Conclusion#
From above example, we could see:
advantages
doesn’t need to pre-code firstly
disadvantages
not OOP style but functional style, a little bit weird for switching two different styles.
doesn’t support calling attributes that don’t exist.
Summary#
Compare these methods, we can get ranking at different backgrounds.
Flexibility / Scalability: accessor >
apply>getattrEase of use
With pre-code accessor: accessor >
getattr>applyWithout pre-code accessor:
getattr>apply
The aim of getattr is to quickly access element attributes of Series attributes without more codes.
Base on that, it could quickly fetch attributes from Series element.
So it is a minimal accessor to Series.