Avoid extra call of safe_path in file_extension() #141
Codecov Report: All modified and coverable lines are covered by tests ✅
|
Is the speed up measurable? |
Not with the benchmarks in the docs, as there we look at only 160 files. But it can be measured when using more files, e.g. the execution time when reading 100,000 files with a duration of 0.1 s and a sampling rate of 8000 Hz.
Benchmark code:

import os
import time

import numpy as np

import audeer
import audiofile


duration = 0.1
sampling_rate = 8000
repetitions = 100000

# Create files for benchmark
folder = audeer.mkdir('mix')
files = [
    audeer.path(folder, f'long-{n}.wav')
    for n in range(repetitions)
]
for file in files:
    if not os.path.exists(file):
        signal = np.random.normal(0, 0.1, (1, int(sampling_rate * duration)))
        signal /= (np.max(np.abs(signal)) + 0.0001)
        audiofile.write(file, signal, sampling_rate)

# Benchmark: read every file once and measure the total time
start = time.time()
for file in files:
    audiofile.read(file)
end = time.time()
print(end - start)

I also saw this when profiling |
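The profiling mentioned above is not shown in the thread. A minimal sketch of how such a hot spot could be located with `cProfile` from the standard library follows. The `file_extension()` function here is a hypothetical stand-in that resolves the path before taking the extension; it is not the audeer implementation, and the file names are made up.

```python
import cProfile
import io
import os
import pstats


def file_extension(path):
    # Hypothetical stand-in, NOT audeer's code:
    # resolve the path first, then take the extension.
    resolved = os.path.realpath(os.path.expanduser(path))
    return os.path.splitext(resolved)[1][1:]


def profile_extension_calls(files):
    """Return cProfile statistics (as text) for looping over files."""
    profiler = cProfile.Profile()
    profiler.enable()
    for file in files:
        file_extension(file)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the ten entries with the highest cumulative time
    stats.sort_stats("cumulative").print_stats(10)
    return stream.getvalue()


# The files do not need to exist for this sketch
report = profile_extension_calls([f"long-{n}.wav" for n in range(1000)])
print(report)
```

In the report, the share of cumulative time spent in the path resolution relative to the whole loop gives a rough idea of how much a redundant call costs.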
Wow, I was expecting a call to |
Yes, it has a big impact if you loop over a lot of files and all your other processing is relatively fast. E.g. when removing One of the advantages of using

import audeer
import os

audeer.touch('test.wav')
os.symlink('test.wav', 'test.mp3')
audeer.file_extension('test.mp3')

returns
|
Ok, I see. Basically, it means there is at least one disk operation we cannot avoid since we cannot derive from the string if a path is symbolic or not. |
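The point about symbolic links can be illustrated with the standard library alone. The sketch below does not use audeer; the two helper functions are hypothetical names introduced for illustration. It assumes a platform and user that may create symbolic links (e.g. Linux or macOS).

```python
import os
import tempfile


def extension_from_string(path):
    """Extension taken from the path string only, without disk access."""
    return os.path.splitext(path)[1][1:]


def extension_after_resolving(path):
    """Extension of the link target; realpath() has to query the file system."""
    return os.path.splitext(os.path.realpath(path))[1][1:]


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "test.wav")
    link = os.path.join(tmp, "test.mp3")
    open(target, "w").close()  # create an empty target file
    os.symlink(target, link)   # test.mp3 points to test.wav
    from_string = extension_from_string(link)
    resolved = extension_after_resolving(link)

print(from_string, resolved)  # prints "mp3 wav"
```

The string alone yields `mp3`, while resolving the link yields `wav`, which is why at least one file system access is needed if links should be followed.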
Yes, we can only say we don't care and replace |
Does Windows support symbolic links? If not, we could at least there switch to |
I added audeering/audeer#131 for further discussion on the |
This speeds up audiofile.read() and all the functions from audiofile.core.info by not calling audeer.safe_path() when getting the file extension of a file, as we called audeer.safe_path(file) before.
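The idea of the change can be sketched as follows: resolve the path once, then derive the extension from the already resolved string. This is an illustration under stated assumptions, not the code of this pull request. `safe_path()` is a hypothetical stand-in for `audeer.safe_path()`, and the `resolve` argument and `read_info()` are made up for the example.

```python
import os

# Counts how often the comparatively expensive resolution step runs
resolve_calls = 0


def safe_path(path):
    """Hypothetical stand-in for audeer.safe_path(): expand and resolve."""
    global resolve_calls
    resolve_calls += 1
    return os.path.realpath(os.path.expanduser(path))


def file_extension(path, *, resolve=True):
    """Return the extension; skip resolving if the caller already did it."""
    if resolve:
        path = safe_path(path)
    return os.path.splitext(path)[1][1:]


def read_info(file):
    """Resolve the path once and reuse it for the extension lookup."""
    file = safe_path(file)
    extension = file_extension(file, resolve=False)
    return file, extension


file, extension = read_info("clip.wav")
print(extension, resolve_calls)  # prints "wav 1"
```

With `resolve=True` in `read_info()` the counter would end at 2, which corresponds to the extra call this pull request avoids.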