5.8. Series Slice

5.8.1. SetUp

>>> import pandas as pd

5.8.2. Numeric Index

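Series[] can be used to slice the series
Series.iloc[] slices the series by position; the upper bound is exclusive!
Series.loc[] slices the series by label, even when the labels are integers; both bounds are inclusive!
With the index 0, 1, 2, ... labels and positions are the same numbers, so the two are easy to confuse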
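
The difference between the two accessors is easiest to see side by side. The sketch below rebuilds the same series as in this chapter; the variable names `by_position` and `by_label` are only illustrative:

```python
import pandas as pd

s = pd.Series(
    data=[1.0, 2.0, 3.0, 4.0, 5.0],
    index=[0, 1, 2, 3, 4],
)

# .iloc slices by position: stop is exclusive, like a Python list
by_position = s.iloc[:2]

# .loc slices by label: stop is inclusive, even for integer labels
by_label = s.loc[:2]

print(by_position.tolist())  # [1.0, 2.0]
print(by_label.tolist())     # [1.0, 2.0, 3.0]
```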

SetUp:

>>> s = pd.Series(
...     data=[1.0, 2.0, 3.0, 4.0, 5.0],
...     index=[0, 1, 2, 3, 4],
... )
>>>
>>> s
0    1.0
1    2.0
2    3.0
3    4.0
4    5.0
dtype: float64

Elements with labels up to and including 2 (three elements, because .loc includes the upper bound):

>>> s.loc[:2]
0    1.0
1    2.0
2    3.0
dtype: float64

Elements from label 2 to the end:

>>> s.loc[2:]
2    3.0
3    4.0
4    5.0
dtype: float64

With .loc a negative bound is a label, not an offset from the end. No label lies between 1 and -2, so the result is empty:

>>> s.loc[1:-2]
Series([], dtype: float64)
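
Counting from the end works with positional slicing. A minimal sketch, assuming the same series as above (the names `first_two`, `last_two` and `middle` are illustrative):

```python
import pandas as pd

s = pd.Series(
    data=[1.0, 2.0, 3.0, 4.0, 5.0],
    index=[0, 1, 2, 3, 4],
)

# Negative bounds in .iloc are offsets from the end
first_two = s.iloc[:2]    # positions 0 and 1
last_two = s.iloc[-2:]    # positions 3 and 4
middle = s.iloc[1:-2]     # from position 1, without the last two

print(middle.tolist())  # [2.0, 3.0]
```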

Every second element:

>>> s.loc[::2]
0    1.0
2    3.0
4    5.0
dtype: float64

Every second element, starting from the second one (label 1, because counting starts at 0):

>>> s.loc[1::2]
1    2.0
3    4.0
dtype: float64

5.8.3. String Index

Series[] can be used to slice the series
Series.loc[] can be used to slice the series using string index
Using string index, both lower and upper bounds are inclusive!
A series with a string index can still be sliced by position with Series.iloc[]

>>> s = pd.Series(
...     data=[1.0, 2.0, 3.0, 4.0, 5.0],
...     index=['a', 'b', 'c', 'd', 'e'],
... )
>>>
>>> s
a    1.0
b    2.0
c    3.0
d    4.0
e    5.0
dtype: float64
>>>
>>> s.loc['a':'d']
a    1.0
b    2.0
c    3.0
d    4.0
dtype: float64
>>>
>>> s.loc['a':'d':2]
a    1.0
c    3.0
dtype: float64
>>>
>>> s.loc['a':'d':'b']
Traceback (most recent call last):
TypeError: '>=' not supported between instances of 'str' and 'int'
>>>
>>> s.loc['d':'a']
Series([], dtype: float64)
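
Reversed bounds give an empty result because the default step walks forward. A short sketch, assuming the same five-letter index, shows that a step of -1 walks backward, and that the positional equivalent gives the same elements:

```python
import pandas as pd

s = pd.Series(
    data=[1.0, 2.0, 3.0, 4.0, 5.0],
    index=['a', 'b', 'c', 'd', 'e'],
)

# Label slice from 'd' down to 'a'; both labels are included
reversed_by_label = s.loc['d':'a':-1]

# The same elements selected by position
reversed_by_position = s.iloc[3::-1]

print(reversed_by_label.index.tolist())  # ['d', 'c', 'b', 'a']
```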

>>> s = pd.Series(
...     data=[1.0, 2.0, 3.0, 4.0, 5.0],
...     index=['aaa', 'bbb', 'ccc', 'ddd', 'eee'],
... )
>>>
>>> s
aaa    1.0
bbb    2.0
ccc    3.0
ddd    4.0
eee    5.0
dtype: float64
>>>
>>> s.loc['a':'b']
aaa    1.0
dtype: float64
>>>
>>> s.loc['a':'c']
aaa    1.0
bbb    2.0
dtype: float64
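
Slicing with bounds that are not in the index, such as 'a' and 'b' above, relies on the index being sorted. The sketch below uses a deliberately unsorted three-element index (an assumption made for illustration) to show that pandas then raises `KeyError`, and that sorting the index first restores the behavior:

```python
import pandas as pd

s = pd.Series(
    data=[3.0, 1.0, 2.0],
    index=['ccc', 'aaa', 'bbb'],
)

# On an unsorted index, missing slice bounds cannot be located
try:
    s.loc['a':'b']
    raised = False
except KeyError:
    raised = True

# After sorting, the bounds are placed by comparison with the labels
sorted_slice = s.sort_index().loc['a':'b']

print(raised)                       # True
print(sorted_slice.index.tolist())  # ['aaa']
```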

5.8.4. Date Index

Series[] can be used to slice the series using date index
Series.loc[] can be used to slice the series using date index
Using date index, both lower and upper bounds are inclusive!
A series with a date index can still be sliced by position with Series.iloc[]

>>> s = pd.Series(
...     data=[1.0, 2.0, 3.0, 4.0, 5.0],
...     index=pd.date_range('1999-12-30', periods=5),
... )
>>>
>>> s
1999-12-30    1.0
1999-12-31    2.0
2000-01-01    3.0
2000-01-02    4.0
2000-01-03    5.0
Freq: D, dtype: float64

>>> s.loc['2000-01-02':'2000-01-04']
2000-01-02    4.0
2000-01-03    5.0
Freq: D, dtype: float64

>>> s.loc['1999-12-30':'2000-01-04':2]
1999-12-30    1.0
2000-01-01    3.0
2000-01-03    5.0
Freq: 2D, dtype: float64

>>> s.loc['1999-12-30':'2000-01-04':-1]
Series([], Freq: -1D, dtype: float64)

>>> s.loc['2000-01-04':'1999-12-30':-1]
2000-01-03    5.0
2000-01-02    4.0
2000-01-01    3.0
1999-12-31    2.0
1999-12-30    1.0
Freq: -1D, dtype: float64

>>> s.loc[:'1999']
1999-12-30    1.0
1999-12-31    2.0
Freq: D, dtype: float64

>>> s.loc['2000':]
2000-01-01    3.0
2000-01-02    4.0
2000-01-03    5.0
Freq: D, dtype: float64

>>> s.loc[:'1999-12']
1999-12-30    1.0
1999-12-31    2.0
Freq: D, dtype: float64

>>> s.loc['2000-01':]
2000-01-01    3.0
2000-01-02    4.0
2000-01-03    5.0
Freq: D, dtype: float64
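
A partial date such as '1999' or '2000-01' stands for the whole period, which is why it works as an inclusive bound. As a sketch on the same series, a partial date can also be used alone, without a slice, to select every entry in that period:

```python
import pandas as pd

s = pd.Series(
    data=[1.0, 2.0, 3.0, 4.0, 5.0],
    index=pd.date_range('1999-12-30', periods=5),
)

# A year selects every entry in that year
whole_year = s.loc['1999']

# A year and month selects every entry in that month
whole_month = s.loc['2000-01']

print(whole_year.tolist())   # [1.0, 2.0]
print(whole_month.tolist())  # [3.0, 4.0, 5.0]
```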

>>> s.loc[:'2000-01-02']
1999-12-30    1.0
1999-12-31    2.0
2000-01-01    3.0
2000-01-02    4.0
Freq: D, dtype: float64

>>> s.loc['2000-01-02':]
2000-01-02    4.0
2000-01-03    5.0
Freq: D, dtype: float64

>>> s.loc['1999-12':'1999-12']
1999-12-30    1.0
1999-12-31    2.0
Freq: D, dtype: float64

>>> s.loc['2000-01':'2000-01-05']
2000-01-01    3.0
2000-01-02    4.0
2000-01-03    5.0
Freq: D, dtype: float64

>>> s.loc[:'2000-01-05':2]
1999-12-30    1.0
2000-01-01    3.0
2000-01-03    5.0
Freq: 2D, dtype: float64

>>> s.loc[:'2000-01-03':-1]
2000-01-03    5.0
Freq: -1D, dtype: float64
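
Date bounds do not have to be strings. The sketch below, on the same series, compares string bounds, `pd.Timestamp` bounds and positional slicing; note that only the positional version excludes its upper bound:

```python
import pandas as pd

s = pd.Series(
    data=[1.0, 2.0, 3.0, 4.0, 5.0],
    index=pd.date_range('1999-12-30', periods=5),
)

# Label slicing with strings: both dates are included
by_string = s.loc['2000-01-01':'2000-01-02']

# Label slicing with Timestamp objects: both dates are included
by_timestamp = s.loc[pd.Timestamp('2000-01-01'):pd.Timestamp('2000-01-02')]

# Positional slicing: position 4 is excluded
by_position = s.iloc[2:4]

print(by_string.tolist())  # [3.0, 4.0]
```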

5.8.5. Assignments

# %% About
# - Name: Series Slice Datetime
# - Difficulty: easy
# - Lines: 1
# - Minutes: 2

# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author

# %% English
# 1. Given is `DATA: pd.Series` with dates since 2000
# 2. Define `result: pd.Series` with values for dates between 2000-02-14 and end of February 2000
# 3. Run doctests - all must succeed

# %% Polish
# 1. Dany jest `DATA: pd.Series` z datami od 2000 roku
# 2. Zdefiniuj `result: pd.Series` z wartościami pomiędzy datami od 2000-02-14 do końca lutego 2000
# 3. Uruchom doctesty - wszystkie muszą się powieść

# %% Expected
# >>> result
# 2000-02-14   -0.5097
# 2000-02-15   -0.4381
# 2000-02-16   -1.2528
# 2000-02-17    0.7775
# 2000-02-18   -1.6139
# 2000-02-19   -0.2127
# 2000-02-20   -0.8955
# 2000-02-21    0.3869
# 2000-02-22   -0.5108
# 2000-02-23   -1.1806
# 2000-02-24   -0.0282
# 2000-02-25    0.4283
# 2000-02-26    0.0665
# 2000-02-27    0.3025
# 2000-02-28   -0.6343
# 2000-02-29   -0.3627
# Freq: D, dtype: float64

# %% Hints
# - `pd.Series.loc[]`

# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0

>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'

>>> assert 'result' in globals(), \
'Variable `result` is not defined; assign result of your program to it.'

>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'

>>> assert type(result) is pd.Series, \
'Variable `result` has an invalid type; expected: `pd.Series`.'

>>> pd.set_option('display.max_columns', 50)
>>> pd.set_option('display.max_rows', 200)
>>> pd.set_option('display.width', 500)
>>> pd.set_option('display.memory_usage', 'deep')
>>> pd.set_option('display.precision', 4)

>>> result
2000-02-14   -0.5097
2000-02-15   -0.4381
2000-02-16   -1.2528
2000-02-17    0.7775
2000-02-18   -1.6139
2000-02-19   -0.2127
2000-02-20   -0.8955
2000-02-21    0.3869
2000-02-22   -0.5108
2000-02-23   -1.1806
2000-02-24   -0.0282
2000-02-25    0.4283
2000-02-26    0.0665
2000-02-27    0.3025
2000-02-28   -0.6343
2000-02-29   -0.3627
Freq: D, dtype: float64
"""

# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`

# %% Imports
import pandas as pd
import numpy as np

# %% Types
result: pd.Series

# %% Data
np.random.seed(0)

DATA = pd.Series(
    data=np.random.randn(100),
    index=pd.date_range('2000-01-01', freq='D', periods=100))

# %% Result
result = ...

# %% About
# - Name: Slicing Slice Str
# - Difficulty: easy
# - Lines: 2
# - Minutes: 5

# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author

# %% English
# 1. Find the middle element of `DATA: pd.Series`
# 2. Slice 5 elements from the series:
#    - two elements before middle
#    - one middle element
#    - two elements after middle
# 3. Run doctests - all must succeed

# %% Polish
# 1. Znajdź środkowy element `DATA: pd.Series`
# 2. Wytnij z serii 5 elementów:
#    - dwa elementy przed środkowym
#    - jeden środkowy element
#    - dwa elementy za środkowym
# 3. Uruchom doctesty - wszystkie muszą się powieść

# %% Hints
# - `pd.Series.iloc[]`

# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0

>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'

>>> assert 'result' in globals(), \
'Variable `result` is not defined; assign result of your program to it.'

>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'

>>> assert type(result) is pd.Series, \
'Variable `result` has an invalid type; expected: `pd.Series`.'

>>> pd.set_option('display.max_columns', 50)
>>> pd.set_option('display.max_rows', 200)
>>> pd.set_option('display.width', 500)
>>> pd.set_option('display.memory_usage', 'deep')
>>> pd.set_option('display.precision', 4)

>>> result
l    98
m    98
n    22
o    68
p    75
dtype: int64
"""

# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`

# %% Imports
import pandas as pd
import numpy as np

# %% Types
result: pd.Series

# %% Data
np.random.seed(0)

DATA = pd.Series(
    data=np.random.randint(10, 100, size=26),
    index=['a', 'b', 'c', 'd', 'e', 'f', 'g',
           'h', 'i', 'j', 'k', 'l', 'm', 'n',
           'o', 'p', 'q', 'r', 's', 't', 'u',
           'v', 'w', 'x', 'y', 'z']
)

# %% Result
result = ...