Add audb.stream() and audb.DatabaseIterator #448

Merged on Aug 21, 2024 (39 commits)
8c2f235
Add map argument to audb.load_table()
hagenw Aug 14, 2024
3a4c3d3
Improve docstring example
hagenw Aug 14, 2024
2b9953d
Add audb.stream()
hagenw Aug 14, 2024
d07606a
Add more ideas
hagenw Aug 14, 2024
9bbe4e7
Create first working version of stream
hagenw Aug 14, 2024
429c633
Move to extra file
hagenw Aug 15, 2024
ae8b3de
Use extra class for parquet
hagenw Aug 15, 2024
100bf6b
Add solution for csv files
hagenw Aug 15, 2024
d634de2
Add shuffle support for CSV
hagenw Aug 15, 2024
45c08d9
Finish first implementation
hagenw Aug 15, 2024
4f082ee
Add string representation
hagenw Aug 15, 2024
bc0c1ae
Adjust docstring
hagenw Aug 15, 2024
3d342ec
Add to documentation
hagenw Aug 15, 2024
02db6ea
Fix raises section of docs
hagenw Aug 15, 2024
beef85c
Use DatabaseIterator name, shuffle in example
hagenw Aug 15, 2024
1d9805c
Fix map
hagenw Aug 15, 2024
d4f101a
Simplify docstring example
hagenw Aug 15, 2024
859bc02
Inherit DatabaseIterator from audformat.Database
hagenw Aug 15, 2024
2dfac8b
Improve example
hagenw Aug 15, 2024
72b7b0b
Fix type annotations of __next__ + __iter__
hagenw Aug 15, 2024
d095739
Fix storing in flavor cache folder
hagenw Aug 16, 2024
ad779c7
Use __iter__ from audformat.Database
hagenw Aug 16, 2024
866c9a3
Revert "Use __iter__ from audformat.Database"
hagenw Aug 16, 2024
81872a6
Fix buffering
hagenw Aug 16, 2024
212501a
Add first tests
hagenw Aug 16, 2024
35de359
Add tests and fix bugs
hagenw Aug 16, 2024
9092e4a
Extend docstring of audb.stream()
hagenw Aug 19, 2024
fc79444
Add section to usage documentation
hagenw Aug 19, 2024
bd006f4
DEBUG Windows
hagenw Aug 19, 2024
e5fd6d2
DEBUG Windows
hagenw Aug 19, 2024
b78309c
Try chunks for CSV reader
hagenw Aug 19, 2024
20b736d
Restructure Iterator classes
hagenw Aug 19, 2024
1048f83
Try to fix doctest under Windows
hagenw Aug 19, 2024
59f8938
Set default value of buffer_size to 100_000
hagenw Aug 19, 2024
999056d
Update docs/load.rst
hagenw Aug 20, 2024
636fdf5
Use **kwargs to simplify code
hagenw Aug 21, 2024
676698e
Remove __init__ function in inherited Iterators
hagenw Aug 21, 2024
54ee461
Make sure persistent repository is cleaned
hagenw Aug 21, 2024
920c857
Make DatabaseIterator abstract class
hagenw Aug 21, 2024
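
The commit titles above mention batching, shuffle support, a `buffer_size` default of 100_000, and an abstract `DatabaseIterator` with inherited iterators. The following is a minimal sketch of how such a pattern could look. It is not the code from this PR: `BatchIterator`, `ListIterator`, `_read`, and `seed` are hypothetical names chosen for illustration, and the refill strategy (read one buffer, shuffle it, hand out batches) is an assumption.

```python
import abc
import random
from typing import Any, List, Optional


class BatchIterator(abc.ABC):
    """Hypothetical base class, not audb.DatabaseIterator itself."""

    def __init__(
        self,
        *,
        batch_size: int,
        shuffle: bool = False,
        buffer_size: int = 100_000,  # default named in the commit list
        seed: Optional[int] = None,  # assumption: only for reproducible demos
    ):
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.buffer_size = buffer_size
        self._rng = random.Random(seed)
        self._buffer: List[Any] = []

    @abc.abstractmethod
    def _read(self, n: int) -> List[Any]:
        """Return up to n further rows, or an empty list when done."""

    def __iter__(self) -> "BatchIterator":
        return self

    def __next__(self) -> List[Any]:
        if not self._buffer:
            # Refill: a whole buffer when shuffling, one batch otherwise
            n = self.buffer_size if self.shuffle else self.batch_size
            self._buffer = self._read(n)
            if self.shuffle:
                self._rng.shuffle(self._buffer)
        if not self._buffer:
            raise StopIteration
        batch = self._buffer[: self.batch_size]
        self._buffer = self._buffer[self.batch_size :]
        return batch


class ListIterator(BatchIterator):
    """Toy subclass reading rows from an in-memory list."""

    def __init__(self, rows: List[Any], **kwargs):
        super().__init__(**kwargs)
        self._rows = list(rows)
        self._pos = 0

    def _read(self, n: int) -> List[Any]:
        chunk = self._rows[self._pos : self._pos + n]
        self._pos += len(chunk)
        return chunk


for batch in ListIterator(list(range(10)), batch_size=4):
    print(batch)
```

In this sketch, shuffling only mixes rows within one buffer, so a larger `buffer_size` trades memory for better mixing. A format-specific subclass only has to implement `_read`.
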
2 changes: 2 additions & 0 deletions audb/__init__.py
@@ -19,6 +19,8 @@
 from audb.core.load_to import load_to
 from audb.core.publish import publish
 from audb.core.repository import Repository
+from audb.core.stream import DatabaseIterator
+from audb.core.stream import stream


 __all__ = []
25 changes: 23 additions & 2 deletions audb/core/load.py
@@ -25,7 +25,7 @@
 from audb.core.utils import lookup_backend


-CachedVersions = typing.Sequence[typing.Tuple[audeer.StrictVersion, str, Dependencies],]
+CachedVersions = typing.Sequence[typing.Tuple[audeer.StrictVersion, str, Dependencies]]


 def _cached_versions(
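
The only difference in the `CachedVersions` line is a trailing comma inside the outer subscript. A quick check, using `int` and `str` as stand-ins for `audeer.StrictVersion` and `Dependencies` (an assumption made to keep the snippet free of third-party imports), suggests that both spellings produce the same type alias at runtime, so the change looks purely cosmetic:

```python
import typing

# A trailing comma turns the subscript into a one-element tuple,
# which typing treats the same as a single parameter.
with_comma = typing.Sequence[typing.Tuple[int, str],]
without_comma = typing.Sequence[typing.Tuple[int, str]]

print(with_comma == without_comma)
print(typing.get_args(with_comma) == typing.get_args(without_comma))
```
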
@@ -805,7 +805,28 @@ def _misc_tables_used_in_scheme(
         if scheme.uses_table:
             misc_tables_used_in_scheme.append(scheme.labels)

-    return list(set(misc_tables_used_in_scheme))
+    return audeer.unique(misc_tables_used_in_scheme)
+
+
+def _misc_tables_used_in_table(
+    table: audformat.Table,
+) -> typing.List[str]:
+    r"""List of misc tables that are used inside schemes of a table.
+
+    Args:
+        table: table object
+
+    Returns:
+        unique list of misc tables used in schemes of the table
+
+    """
+    misc_tables_used_in_table = []
+    for column_id, column in table.columns.items():
+        if column.scheme_id is not None:
+            scheme = table.db.schemes[column.scheme_id]
+            if scheme.uses_table:
+                misc_tables_used_in_table.append(scheme.labels)
+    return audeer.unique(misc_tables_used_in_table)


 def _missing_files(
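
The logic of `_misc_tables_used_in_table()` can be tried out without audformat. The sketch below uses hypothetical stand-ins: `Scheme` and `Column` are small dataclasses, not the audformat classes; `uses_table` is assumed to be true when `labels` holds a misc table ID as a string; and `unique_in_order()` stands in for `audeer.unique()` under the assumption that it keeps the first occurrence of each value in its original order. The PR does not state why `list(set(...))` was replaced, so the ordering aspect is only one possible reading.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class Scheme:
    """Stand-in for a scheme; labels is a misc table ID or a label list."""

    labels: Union[str, List[str]]

    @property
    def uses_table(self) -> bool:
        # Assumption: a string label refers to a misc table
        return isinstance(self.labels, str)


@dataclass
class Column:
    """Stand-in for a table column that may reference a scheme."""

    scheme_id: Optional[str] = None


def unique_in_order(values: List[str]) -> List[str]:
    # dict keys keep insertion order, so first occurrences win
    return list(dict.fromkeys(values))


def misc_tables_used(
    columns: Dict[str, Column],
    schemes: Dict[str, Scheme],
) -> List[str]:
    used = []
    for column in columns.values():
        if column.scheme_id is not None:
            scheme = schemes[column.scheme_id]
            if scheme.uses_table:
                used.append(scheme.labels)
    return unique_in_order(used)


schemes = {
    "speaker": Scheme("speaker-info"),
    "dialect": Scheme("dialects"),
    "emotion": Scheme(["happy", "sad"]),
}
columns = {
    "speaker": Column("speaker"),
    "transcription": Column(None),
    "dialect": Column("dialect"),
    "second-speaker": Column("speaker"),
    "emotion": Column("emotion"),
}
print(misc_tables_used(columns, schemes))
```

Columns without a scheme and schemes with plain label lists are skipped, and a misc table referenced by several columns is reported once.
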