FrameSubsampler broken in version 1.3.0 #311

Open
libeanim opened this issue Feb 9, 2024 · 10 comments

@libeanim
Contributor

libeanim commented Feb 9, 2024

When adding the FrameSubsampler to the config

subsampling:
    FrameSubsampler:
        args:
            frame_rate: 5
            downsample_method: 'fps'
            encode_format: 'mp4'

I get the following error message:

Traceback (most recent call last):
  File "/home/evobits/miniconda3/envs/laion/lib/python3.9/site-packages/video2dataset/workers/download_worker.py", line 102, in __call__
    self.download_shard(row)
  File "/home/evobits/miniconda3/envs/laion/lib/python3.9/site-packages/video2dataset/workers/download_worker.py", line 161, in download_shard
    writer_encode_formats["video"] = self.subsamplers["video"][0].encode_formats["video"]
AttributeError: 'FrameSubsampler' object has no attribute 'encode_formats'
shard ./TEST/results2/_tmp/9.feather failed with error 'FrameSubsampler' object has no attribute 'encode_formats'

Is this a typo in download_worker.py, or is there an issue with my config?

@libeanim
Contributor Author

libeanim commented Feb 9, 2024

Seems to be related to #263

@rom1504
Collaborator

rom1504 commented Feb 9, 2024

There seems to be a discrepancy between encode_format and encode_formats.
We need to choose one and use it everywhere.

@rom1504
Collaborator

rom1504 commented Feb 9, 2024

video_subsamplers.append(FrameSubsampler(**self.config["subsampling"]["FrameSubsampler"]["args"]))
Looks like there's no trailing s in the frame subsampler (encode_format rather than encode_formats).

@rom1504
Collaborator

rom1504 commented Feb 9, 2024

writer_encode_formats["video"] = self.subsamplers["video"][0].encode_formats["video"]
So yes, that line is wrong, and this code path seems to be untested.

@rom1504
Collaborator

rom1504 commented Feb 9, 2024

Probably the easiest fix is to migrate everything to encode_formats.
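
A minimal sketch of what a consistent `encode_formats` attribute could look like. `FrameSubsamplerSketch` is a hypothetical stand-in, not the real video2dataset class, and its constructor arguments are assumptions taken from the config in the report. The last line mirrors the lookup shown in the traceback.

```python
class FrameSubsamplerSketch:
    """Hypothetical stand-in for a subsampler; not the library's implementation."""

    def __init__(self, frame_rate, downsample_method="fps", encode_formats=None):
        self.frame_rate = frame_rate
        self.downsample_method = downsample_method
        # Store a dict keyed by modality, which is the shape the worker reads.
        self.encode_formats = encode_formats or {"video": "mp4"}


# Assumed layout: one list of subsamplers per modality, as in the traceback.
subsamplers = {"video": [FrameSubsamplerSketch(frame_rate=5)]}

writer_encode_formats = {}
# Same access pattern as the failing line in download_worker.py.
writer_encode_formats["video"] = subsamplers["video"][0].encode_formats["video"]
```

With the dict-shaped attribute in place, the lookup no longer raises AttributeError; whether the real class should also keep accepting the singular `encode_format` argument is a separate design choice.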

@rom1504
Collaborator

rom1504 commented Feb 9, 2024

https://github.com/iejMac/video2dataset/pull/287/files this was broken in this PR

@rom1504
Collaborator

rom1504 commented Feb 9, 2024

The main problem here is the absence of a test for this subsampler usage.
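
A rough sketch of the kind of check that would have caught this, assuming the worker only needs each video subsampler to expose a dict-shaped `encode_formats`. Both stub classes and the helper `exposes_encode_formats` are hypothetical and exist only for illustration; a real test would instantiate the library's subsamplers from a config instead.

```python
class OldStyleSubsampler:
    """Hypothetical stub exposing only the singular attribute (the failing shape)."""

    def __init__(self, encode_format="mp4"):
        self.encode_format = encode_format


class NewStyleSubsampler:
    """Hypothetical stub exposing the dict-shaped attribute."""

    def __init__(self, encode_formats=None):
        self.encode_formats = encode_formats or {"video": "mp4"}


def exposes_encode_formats(subsampler, modality="video"):
    """Return True if a worker-style lookup of encode_formats[modality] would succeed."""
    formats = getattr(subsampler, "encode_formats", None)
    return isinstance(formats, dict) and modality in formats


def test_subsamplers_expose_encode_formats():
    # pytest-style test: the new shape passes, the old shape is flagged.
    assert exposes_encode_formats(NewStyleSubsampler())
    assert not exposes_encode_formats(OldStyleSubsampler())
```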

@rom1504
Collaborator

rom1504 commented Feb 9, 2024

https://github.com/iejMac/video2dataset/pull/271/files that fix seems to be going in the wrong direction

@libeanim
Contributor Author

libeanim commented Feb 15, 2024

Just for testing, I added

self.encode_formats = {'video': encode_format}

to the FrameSubsampler.__init__ method.
The original problem seems to be solved, but now I am getting this error:

Downloading starting now, check your bandwidth speed (with bwm-ng)your cpu (with htop), and your disk usage (with iotop)!
Traceback (most recent call last):
  File "/home/evobits/miniconda3/envs/laion/lib/python3.9/site-packages/video2dataset/workers/download_worker.py", line 237, in download_shard
    for modality in subsampled_streams:
RuntimeError: dictionary keys changed during iteration
Sample 0 failed to download: dictionary keys changed during iteration

Not entirely sure how to proceed.
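
For context, Python raises this RuntimeError when a dict is mutated inside a `for` loop over that same dict. The sketch below is a generic illustration and does not reproduce what download_worker.py or FrameSubsampler actually do; the function names and the `_subsampled` suffix are made up. Iterating over a snapshot of the keys (`list(streams)`) is one common way to avoid the error.

```python
def rename_keys_unsafe(streams):
    """Replace keys while iterating the same dict; raises RuntimeError."""
    for modality in streams:
        # Removing one key and adding another mutates the dict mid-iteration.
        streams[modality + "_subsampled"] = streams.pop(modality)
    return streams


def rename_keys_safe(streams):
    """Iterate over a snapshot of the keys, so the dict can be mutated freely."""
    for modality in list(streams):
        streams[modality + "_subsampled"] = streams.pop(modality)
    return streams


try:
    rename_keys_unsafe({"video": [b"v"], "audio": [b"a"]})
except RuntimeError as err:
    print("unsafe version failed:", err)

print(rename_keys_safe({"video": [b"v"], "audio": [b"a"]}))
```

If the code in question needs to add or replace entries while walking `subsampled_streams`, building a new dict and assigning it afterwards is an alternative to the snapshot approach.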

@marianna13
Contributor

What about this fix? https://github.com/marianna13/video2dataset/blob/6e9d704b687cf3a2311f565b2ca387eeed73337d/video2dataset/subsamplers/frame_subsampler.py#L39

I use the following config:

subsampling: 
    FrameSubsampler:
        args:
            frame_rate: 5
            downsample_method: 'fps'
            encode_formats: 
                video: 'mp4'

libeanim added a commit to libeanim/video2dataset that referenced this issue Feb 24, 2024
- Fix encode_formats bug iejMac#311
- Fix bug `dictionary keys changed during iteration` in FrameSubsampler
rom1504 pushed a commit that referenced this issue Feb 24, 2024
* Add keyframe subsampler to FrameSubsampler

- Fix encode_formats bug #311
- Fix bug `dictionary keys changed during iteration` in FrameSubsampler

* Format code in frame_subsampler