12.1. Serialization About
What is serialization?
Serialization - converting an object to a string in order to save it to a file or send it over a network
Deserialization - recreating an object from a string containing serialized data
Dumps = object -> str (convert object to string)
Loads = str -> object (reconstruct object from string)
Dump = object -> file (convert object and write result to file)
Load = file -> object (reconstruct object from data in file)
Serialization is the process of converting an object into a format that can be easily stored or transmitted and then reconstructed later. This is useful for saving the state of an object to a file, sending it over a network, or storing it in a database. Common serialization formats include JSON, XML, and binary formats like those used by the pickle module in Python.
>>> data = ('Alice', 'Bob', 'Carol')
>>>
>>> serialized = str(data)
>>> serialized
"('Alice', 'Bob', 'Carol')"
>>>
>>> unserialized = eval(serialized)
>>> unserialized
('Alice', 'Bob', 'Carol')
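The `eval()` call above executes any expression found in the string, so it is only suitable for trusted input. A minimal sketch of a safer variant, assuming the serialized data contains nothing but Python literals, uses `ast.literal_eval()` from the standard library:

```python
from ast import literal_eval

data = ('Alice', 'Bob', 'Carol')

# str() produces the literal representation of the tuple
serialized = str(data)

# literal_eval() accepts literals only (strings, numbers, tuples, lists, ...)
# and raises an exception for anything else, such as a function call
unserialized = literal_eval(serialized)

print(unserialized)
```

Because the string is parsed as a tuple literal, the result is again a `tuple`, just as with `eval()`.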
12.1.1. Formats
CSV
JSON
XML
YAML
Pickle
TOML
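The standard-library `json` and `pickle` modules follow the same `dumps` / `loads` / `dump` / `load` naming convention described earlier. As a small illustration (the variable names are chosen only for this example), a JSON round trip might look like this:

```python
import json

data = ['Alice', 'Bob', 'Carol']

# dumps: object -> str
serialized = json.dumps(data)

# loads: str -> object
restored = json.loads(serialized)

print(serialized)
print(restored)
```

Note that JSON has no tuple type, so a tuple passed to `json.dumps()` comes back from `json.loads()` as a list.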
12.1.2. Dumps
object -> string
>>> def dumps(data):
...     return ','.join(data)
>>>
>>>
>>> data = ('Alice', 'Bob', 'Carol')
>>> dumps(data)
'Alice,Bob,Carol'
12.1.3. Loads
string -> object
>>> def loads(string):
...     return string.split(',')
>>>
>>>
>>> data = 'Alice,Bob,Carol'
>>> loads(data)
['Alice', 'Bob', 'Carol']
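The `dumps()` and `loads()` functions above work as long as no value contains the separator itself. One possible way to handle such values, sketched here under the assumption that CSV-style quoting is acceptable, is to delegate to the standard-library `csv` module (the `fieldseparator` parameter name is borrowed from the assignments in this chapter):

```python
import csv
import io


def dumps(data, fieldseparator=','):
    # Write a single row to an in-memory text buffer
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=fieldseparator, lineterminator='\n')
    writer.writerow(data)
    return buffer.getvalue()


def loads(string, fieldseparator=','):
    # Parse the first row from the string
    reader = csv.reader(io.StringIO(string), delimiter=fieldseparator)
    return next(reader)


data = ['Alice', 'Smith, Bob', 'Carol']
serialized = dumps(data)
restored = loads(serialized)

print(serialized)
print(restored)
```

The value containing a comma is wrapped in quotes on output and unwrapped on input, so the round trip returns the original three items.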
12.1.4. Dump
object -> file
>>> def dump(data, filename):
...     result = ','.join(data) + '\n'
...     with open(filename, mode='wt') as file:
...         file.write(result)
>>>
>>>
>>> data = ('Alice', 'Bob', 'Carol')
>>> dump(data, '/tmp/myfile.csv')
$ cat /tmp/myfile.csv
Alice,Bob,Carol
12.1.5. Load
file -> object
$ echo 'Alice,Bob,Carol' > /tmp/myfile.csv
>>> def load(filename):
...     with open(filename, mode='rt') as file:
...         data = file.read().strip()
...     return data.split(',')
>>>
>>>
>>> load('/tmp/myfile.csv')
['Alice', 'Bob', 'Carol']
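The `dump()` and `load()` functions can be combined into a full round trip. The sketch below is one way to do it, assuming a temporary directory instead of `/tmp/myfile.csv` and an explicit `utf-8` encoding; the `fieldseparator` parameter is borrowed from the assignments in this chapter:

```python
from pathlib import Path
from tempfile import TemporaryDirectory


def dump(data, filename, fieldseparator=','):
    # object -> file
    with open(filename, mode='wt', encoding='utf-8') as file:
        file.write(fieldseparator.join(data) + '\n')


def load(filename, fieldseparator=','):
    # file -> object
    with open(filename, mode='rt', encoding='utf-8') as file:
        return file.read().strip().split(fieldseparator)


# The directory and the file inside it are removed when the block exits
with TemporaryDirectory() as directory:
    path = Path(directory) / 'myfile.csv'
    dump(('Alice', 'Bob', 'Carol'), path)
    restored = load(path)

print(restored)
```

The data goes in as a tuple and comes back as a list, because this simple text format stores only the values and not the type of the container.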
12.1.6. Assignments
# %% About
# - Name: Serialization About Dumps
# - Difficulty: easy
# - Lines: 2
# - Minutes: 2
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Define function `dumps()` serializing `list[str]` to `str`:
# - Argument: `data: list[str]`, `fieldseparator: str = ','`
# - Returns: `str`
# 2. Join data by field separator `fieldseparator`
# 3. Run doctests - all must succeed
# %% Polish
# 1. Zdefiniuj funkcję `dumps()` serializującą `list[str]` do `str`:
# - Argument: `data: list[str]`, `fieldseparator: str = ','`
# - Zwraca: `str`
# 2. Złącz dane separatorem pól `fieldseparator`
# 3. Uruchom doctesty - wszystkie muszą się powieść
# %% Example
# >>> result
# 'Alice,Bob,Carol'
# %% Hints
# - `str.join()`
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 12), \
'Python 3.12+ required'
>>> result = dumps(DATA)
>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> assert type(result) is str, \
'Variable `result` has an invalid type; expected: `str`.'
>>> result
'Alice,Bob,Carol'
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
# %% Types
from typing import Callable
dumps: Callable[[list[str], str], str]
result: str
# %% Data
DATA = ['Alice', 'Bob', 'Carol']
# %% Result
def dumps(data, fieldseparator=','):
    ...
# %% About
# - Name: Serialization About Loads
# - Difficulty: easy
# - Lines: 2
# - Minutes: 2
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Define function `loads()` deserializing `str` to `list[str]`:
# - Argument: `data: str`, `fieldseparator: str = ','`
# - Returns: `list[str]`
# 2. Split data by field separator `fieldseparator`
# 3. Run doctests - all must succeed
# %% Polish
# 1. Zdefiniuj funkcję `loads()` deserializującą `str` do `list[str]`:
# - Argument: `data: str`, `fieldseparator: str = ','`
# - Zwraca: `list[str]`
# 2. Rozdziel dane separatorem pól `fieldseparator`
# 3. Uruchom doctesty - wszystkie muszą się powieść
# %% Example
# >>> result
# ['Alice', 'Bob', 'Carol']
# %% Hints
# - `str.split()`
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 12), \
'Python 3.12+ required'
>>> result = loads(DATA)
>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> assert type(result) is list, \
'Variable `result` has an invalid type; expected: `list`.'
>>> assert all(type(x) is str for x in result), \
'Variable `result` has elements of an invalid type; all items should be: `str`.'
>>> result
['Alice', 'Bob', 'Carol']
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
# %% Types
from typing import Callable
loads: Callable[[str, str], list[str]]
result: list[str]
# %% Data
DATA = 'Alice,Bob,Carol'
# %% Result
def loads(data, fieldseparator=','):
    ...
# %% About
# - Name: Serialization About Dump
# - Difficulty: easy
# - Lines: 2
# - Minutes: 2
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Define function `dump()` serializing `list[str]` to `str` and writing to `FILE`:
# - Argument: `data: list[str]`, `filename: str`, `fieldseparator: str = ','`
# - Returns: `None`
# 2. Join data by field separator `fieldseparator`
# 3. Write data to file using `utf-8` encoding
# 4. Run doctests - all must succeed
# %% Polish
# 1. Zdefiniuj funkcję `dump()` serializującą `list[str]` do `str` i zapisującą wynik do `FILE`:
# - Argument: `data: list[str]`, `filename: str`, `fieldseparator: str = ','`
# - Zwraca: `None`
# 2. Złącz dane separatorem pól `fieldseparator`
# 3. Zapisz dane do pliku używając kodowania `utf-8`
# 4. Uruchom doctesty - wszystkie muszą się powieść
# %% Example
# >>> print(result)
# None
# %% Hints
# - `str.join()`
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 12), \
'Python 3.12+ required'
>>> result = dump(DATA, FILE)
>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> assert result is None, \
'Variable `result` has an invalid type; expected: `None`.'
>>> from os import remove
>>> remove(FILE)
>>> print(result)
None
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
# %% Types
from typing import Callable
dump: Callable[[list[str], str, str], None]
result: None
# %% Data
FILE = '_temporary.dat'
DATA = ['Alice', 'Bob', 'Carol']
# %% Result
def dump(data, filename, fieldseparator=','):
    ...
# %% About
# - Name: Serialization About Load
# - Difficulty: easy
# - Lines: 2
# - Minutes: 2
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Define function `load()` reading data from `FILE` and deserializing `str` to `list[str]`:
# - Argument: `filename: str`, `fieldseparator: str = ','`
# - Returns: `list[str]`
# 2. Read data from file using `utf-8` encoding
# 3. Split data by field separator `fieldseparator`
# 4. Run doctests - all must succeed
# %% Polish
# 1. Zdefiniuj funkcję `load()` odczytującą dane z `FILE` i deserializującą `str` do `list[str]`:
# - Argument: `filename: str`, `fieldseparator: str = ','`
# - Zwraca: `list[str]`
# 2. Odczytaj dane z pliku używając kodowania `utf-8`
# 3. Rozdziel dane separatorem pól `fieldseparator`
# 4. Uruchom doctesty - wszystkie muszą się powieść
# %% Example
# >>> result
# ['Alice', 'Bob', 'Carol']
# %% Hints
# - `str.split()`
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 12), \
'Python 3.12+ required'
>>> result = load(FILE)
>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> assert type(result) is list, \
'Variable `result` has an invalid type; expected: `list`.'
>>> assert all(type(x) is str for x in result), \
'Variable `result` has elements of an invalid type; all items should be: `str`.'
>>> from os import remove
>>> remove(FILE)
>>> result
['Alice', 'Bob', 'Carol']
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
# %% Types
from typing import Callable
load: Callable[[str, str], list[str]]
result: list[str]
# %% Data
FILE = '_temporary.dat'
with open(FILE, mode='wt', encoding='utf-8') as file:
    file.write('Alice,Bob,Carol')
# %% Result
def load(filename, fieldseparator=','):
    ...