Tip
This page was generated from guide/tips_about_getattr.ipynb.
Tips About Accessing Element Attributes of Series#
Series combines data. When a Series combines data of the same type, it becomes a container for accessing the attributes of that data. Such a Series could be called a string type of Series, a Path type of Series, a CRS type of Series, and more.
Are there any ways to access data attributes of Series?
Use the apply method to fetch attributes.
Use a Pandas object accessor, pandas.api.extensions.register_series_accessor.
Use getattr, dtoolkit.accessor.series.getattr.
Use apply Method#
apply Method’s String Example#
[1]:
import pandas as pd
s = pd.Series(["hello", "world"])
s
[1]:
0 hello
1 world
dtype: object
Count the number of 'l' characters.
[2]:
s.apply(lambda x: x.count("l"))
[2]:
0 2
1 1
dtype: int64
Find the index of the first 'l'.
[3]:
s.apply(lambda x: x.find("l"))
[3]:
0 2
1 3
dtype: int64
Return the element length.
[4]:
s.apply(lambda x: len(x))
[4]:
0 5
1 5
dtype: int64
apply Method Conclusion#
From the examples above, we can see:
Advantages
No accessor needs to be written in advance.
A lambda function can return an arbitrary result.
Disadvantages
The code style is poor: every call needs boilerplate lambda code.
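The same trade-off shows up with the "Path type of Series" mentioned in the introduction. The following is a minimal sketch, assuming a Series of pathlib.PurePosixPath objects; the file names are made up for illustration. Each attribute access needs its own lambda, whether the attribute is a plain property or a method.

```python
from pathlib import PurePosixPath

import pandas as pd

# A Series whose elements are all path objects (illustrative file names).
paths = pd.Series([PurePosixPath("data/a.csv"), PurePosixPath("data/b.txt")])

# Plain attribute: read it inside the lambda.
suffixes = paths.apply(lambda p: p.suffix)

# Method: call it inside the lambda, passing any arguments it needs.
renamed = paths.apply(lambda p: p.with_suffix(".json"))

print(suffixes.tolist())
print(renamed.tolist())
```

Nothing has to be prepared beforehand, but the lambda wrapper is repeated for every attribute.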
Use Pandas Object Accessor#
For convenience, Pandas provides a .str accessor to access string attributes for a string type of Series.
.str Accessor’s String Example#
Count the number of 'l' characters.
[5]:
s.str.count("l")
[5]:
0 2
1 1
dtype: int64
Find the index of the first 'l'.
[6]:
s.str.find("l")
[6]:
0 2
1 3
dtype: int64
Return the element length.
[7]:
s.str.len()
[7]:
0 5
1 5
dtype: int64
Accessor Method Conclusion#
From the examples above, we can see:
Advantages
It keeps the same code style: 'a string'.count() and Series.str.count().
It supports adding extra attributes: 'a string'.__len__() -> Series.str.len().
Disadvantages
The accessor must be written in advance.
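To make "written in advance" concrete, here is a minimal sketch of a custom accessor built with pandas.api.extensions.register_series_accessor. The accessor name demo_path and the class DemoPathAccessor are hypothetical, chosen only for this illustration; they are not part of Pandas or dtoolkit.

```python
from pathlib import PurePosixPath

import pandas as pd
from pandas.api.extensions import register_series_accessor


@register_series_accessor("demo_path")  # hypothetical accessor name
class DemoPathAccessor:
    """Expose a couple of path attributes on a Series of path objects."""

    def __init__(self, series: pd.Series):
        self._series = series

    @property
    def suffix(self) -> pd.Series:
        # Plain attribute of each element.
        return self._series.apply(lambda p: p.suffix)

    def with_suffix(self, suffix: str) -> pd.Series:
        # Method of each element, forwarded with its argument.
        return self._series.apply(lambda p: p.with_suffix(suffix))


paths = pd.Series([PurePosixPath("data/a.csv"), PurePosixPath("data/b.txt")])
print(paths.demo_path.suffix.tolist())
print(paths.demo_path.with_suffix(".json").tolist())
```

Once the class is registered, calls read like the .str examples, but every attribute to be exposed has to be written out first.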
Use getattr#
Let us take a quick look at a getattr example.
getattr’s String Example#
[8]:
import dtoolkit
Count the number of 'l' characters.
[9]:
s.getattr("count", "l")
[9]:
0 2
1 1
dtype: int64
Find the index of the first 'l'.
[10]:
s.getattr("find", "l")
[10]:
0 2
1 3
dtype: int64
Return the element length.
[11]:
s.getattr("__len__")
[11]:
0 5
1 5
dtype: int64
getattr Method Conclusion#
From the examples above, we can see:
Advantages
No accessor needs to be written in advance.
Disadvantages
It is functional style rather than OOP style, so switching between the two styles feels a little awkward.
It cannot call attributes that the elements do not have.
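The behaviour shown in the getattr examples can be approximated with plain Pandas. The helper below, series_getattr, is a hypothetical stand-in written for this page. It is not dtoolkit's implementation, and its handling of missing attributes is an assumption of this sketch rather than documented dtoolkit behaviour.

```python
import pandas as pd


def series_getattr(s: pd.Series, name: str, *args, **kwargs) -> pd.Series:
    """Hypothetical helper: fetch ``name`` from every element of ``s``."""

    def fetch(element):
        # Built-in getattr raises AttributeError if the element lacks ``name``.
        attr = getattr(element, name)
        # Call methods with the given arguments; return plain attributes as-is.
        return attr(*args, **kwargs) if callable(attr) else attr

    return s.apply(fetch)


s = pd.Series(["hello", "world"])
counts = series_getattr(s, "count", "l")
lengths = series_getattr(s, "__len__")

try:
    series_getattr(s, "no_such_attribute")
except AttributeError as error:
    # In this sketch, a missing attribute surfaces as an AttributeError.
    print(f"missing attribute: {error}")
```

The sketch can only forward to attributes the elements already define, which is why an accessor remains the more flexible option when new behaviour is needed.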
Summary#
Comparing these methods, we can rank them from different perspectives.
Flexibility / Scalability: accessor > apply > getattr
Ease of use
With a pre-coded accessor: accessor > getattr > apply
Without a pre-coded accessor: getattr > apply
The aim of getattr is to access the attributes of Series elements quickly, without writing extra code.
In that sense, it is a minimal accessor for Series.