
GSoC 2025

Boris Sekachev edited this page Jan 28, 2025 · 28 revisions


This page aggregates information about CVAT.ai Corporation's participation in Google Summer of Code 2025.

Note: It is not yet known whether CVAT.ai will participate in GSoC, as the application has to be confirmed by GSoC first.

Note: The list of projects is still under discussion and may be updated until February 11, 18:00 UTC.


Links

Google Summer of Code references

Wiki references

CVAT resources


Primary time periods

The full timeline may be found on the corresponding GSoC page.

| Date | Description |
|---|---|
| February 11 18:00 UTC | Mentoring organization application deadline |
| February 27 18:00 UTC | List of accepted mentoring organizations published |
| March 24 18:00 UTC - April 8 18:00 UTC | GSoC contributor application period |
| April 29 18:00 UTC | GSoC contributor proposal rankings deadline |
| May 8 18:00 UTC | Accepted GSoC contributor projects announced |
| May 8 - June 1 | Community Bonding Period |
| June 2 | Coding begins |
| July 14 18:00 UTC - July 18 18:00 UTC | Midterm evaluation period |
| September 1 - 8 18:00 UTC | Final evaluation period |

CVAT project ideas summary

Index to Ideas Below

  1. API keys and token-based auth for SDK and CLI
  2. Trackable masks and tags
  3. Embedded notifications
  4. Multiple objects selection and bulk actions
  5. Timeline for tracked objects
  6. Account deletion and optimized resources erasing

Idea Template

All projects require Python and TypeScript programming skills unless otherwise noted.


CVAT project ideas

  1. IDEA: API keys and token-based auth for SDK and CLI

    • Description: Currently, the only official way to authenticate in the SDK/CLI is to provide your username and password in requests. This approach works, but it has security issues. The idea is to let users generate and manage API access keys. Such a key could be used as a replacement for the login/password pair.
    • Expected Outcomes:
      • Users can generate API access tokens in the account settings in the UI
      • Users can see a list of generated tokens and the last time each was used
      • Users can revoke existing API access tokens in the account settings
      • Users can call API endpoints providing API access tokens
      • A token can be used for authentication in the SDK/CLI
    • Resources:
    • Skills Required: Python (Django, DRF), TypeScript (React)
    • Possible Mentors: Maxim Zhiltsov, Roman Donchenko
    • Difficulty: Medium
    • Duration: 175 hours
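
As a rough illustration of the token workflow described in this idea (issue, list with last use, revoke, authenticate), the following standard-library Python sketch keeps only a hash of each token on the server side. `ApiToken` and `TokenStore` are hypothetical names invented for this example and are not part of CVAT, its SDK, or its CLI; an actual implementation would presumably use Django models and DRF authentication classes instead of an in-memory dictionary.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ApiToken:
    """Hypothetical server-side record: only a hash of the secret is stored."""
    name: str
    digest: str
    created_at: datetime
    last_used_at: Optional[datetime] = None
    revoked: bool = False


class TokenStore:
    """In-memory stand-in for a database table (illustrative assumption)."""

    def __init__(self) -> None:
        self._tokens: dict[str, ApiToken] = {}

    @staticmethod
    def _digest(raw: str) -> str:
        # Hash the raw token so a leaked table does not expose usable secrets.
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def issue(self, name: str) -> str:
        """Create a token and return the raw value, shown to the user once."""
        raw = secrets.token_urlsafe(32)
        self._tokens[name] = ApiToken(
            name=name,
            digest=self._digest(raw),
            created_at=datetime.now(timezone.utc),
        )
        return raw

    def authenticate(self, raw: str) -> Optional[ApiToken]:
        """Return the matching active token and record its last use."""
        candidate = self._digest(raw)
        for token in self._tokens.values():
            # compare_digest avoids leaking information through timing.
            if not token.revoked and hmac.compare_digest(token.digest, candidate):
                token.last_used_at = datetime.now(timezone.utc)
                return token
        return None

    def revoke(self, name: str) -> None:
        """Mark a token as revoked so it can no longer authenticate."""
        self._tokens[name].revoked = True

    def list_tokens(self) -> list[ApiToken]:
        """Return all records, e.g. for an account settings page."""
        return list(self._tokens.values())
```

In this sketch, a client would send the raw value returned by `issue` with each request, and the server would call `authenticate` in place of a username/password check.
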
  2. IDEA: Trackable masks and tags

    • Description: CVAT supports both image and video annotation. It has two kinds of objects: shapes and tracks. Shapes are single-frame objects. Tracks may span the whole video, indicating that shapes on different frames belong to the same object. Tracks have additional features: for example, their shapes, attributes, and properties may be interpolated automatically between keyframes. Currently this is true for all object types except masks and tags. The purpose of this task is to unify tracking functionality across all object types.
    • Expected Outcomes:
      • Masks and tags may be tracked across a video.
      • Tags only interpolate their attributes and properties, as they do not have any position.
      • Masks additionally interpolate their position in the simplest way: the position is copied between keyframes without linear interpolation, since interpolating mask positions is not a trivial task.
      • Existing annotation formats are updated accordingly.
    • Resources:
      • N/A
    • Skills Required: Python (Django, DRF), TypeScript (React)
    • Possible Mentors: Maxim Zhiltsov, Roman Donchenko
    • Difficulty: Hard
    • Duration: 350 hours
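
The "copy between keyframes" behaviour proposed for masks (and for tag attributes) amounts to holding the most recent keyframe's state. A minimal Python sketch follows, assuming a track is represented as a mapping from frame number to keyframe state; the function name `hold_interpolate` and this representation are assumptions made for illustration and do not reflect CVAT's internal data model. The sketch also ignores where a track ends or becomes invisible.

```python
from bisect import bisect_right
from typing import Any, Optional


def hold_interpolate(keyframes: dict[int, Any], frame: int) -> Optional[Any]:
    """Return the state of the latest keyframe at or before `frame`.

    No blending is done between keyframes: the previous keyframe's state
    is simply carried forward until the next keyframe replaces it.
    """
    frames = sorted(keyframes)
    # Index of the first keyframe strictly after `frame`.
    idx = bisect_right(frames, frame)
    if idx == 0:
        return None  # the track has not started yet at this frame
    return keyframes[frames[idx - 1]]
```

For example, with keyframes at frames 0 and 10, every frame from 0 to 9 would receive the state stored at frame 0, and frame 10 onward would receive the state stored at frame 10.
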
  3. IDEA: Embedded notifications

    • Description: CVAT is an annotation tool used by teams. However, it currently lacks any notification system, so all communication has to go through third-party channels. It would be nice to have an embedded notification system to simplify the process (e.g. workers are notified about newly assigned annotation jobs or new issue comments in jobs assigned to them; requesters are notified when jobs are completed or rejected).
    • Expected Outcomes:
      • The UI provides a page or an overlay with notifications about recent updates
      • There may be different kinds of notifications; the exact list will be defined at the high-level design stage
      • The UI may show new notifications in real time even when CVAT is closed (using the Service Worker API provided by browsers)
    • Resources:
      • N/A
    • Skills Required: Python (Django, DRF), TypeScript (React)
    • Possible Mentors: Boris Sekachev
    • Difficulty: Hard
    • Duration: 350 hours
  4. IDEA: Multiple objects selection and bulk actions

    • Description: Currently, the annotation interface only allows working with one object at a time. This may be inconvenient in some cases, and the community has proposed selecting multiple objects and performing bulk actions on all selected objects simultaneously (e.g. removing, dragging, resizing, or changing labels, attributes, or properties).
    • Expected Outcomes:
      • Multiple objects may be selected, e.g. by holding Ctrl and clicking additional objects
      • The UI visualizes the selected group on the canvas and in the objects sidebar
      • Changing a property of one object applies the same change (if possible) to the others
      • Related features (e.g. undo/redo) are updated accordingly
    • Resources:
    • Skills Required: TypeScript (React)
    • Possible Mentors: Boris Sekachev
    • Difficulty: Typescript (React)
    • Duration: 350 hours
  5. IDEA: Timeline for tracked objects

    • Description: A timeline shows summary information about the selected track in a video (start/end positions, visibility, changes, navigation features, and a track preview).
    • Expected Outcomes:
      • The user may see a timeline with keyframes
      • The timeline shows where the track starts, ends, and becomes visible or invisible
      • The timeline allows fast navigation to a selected keyframe
      • Each keyframe on the timeline shows brief information (e.g. what was updated on this keyframe)
      • The user may see a track preview (e.g. an animation showing the tracked object on different frames)
    • Resources:
    • Skills Required: TypeScript (React)
    • Possible Mentors: Kirill Lakhov
    • Difficulty: Medium
    • Duration: 175 hours
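
One piece of data such a timeline would need is the list of frame ranges in which the track is visible. The Python sketch below derives those ranges from keyframes, assuming each keyframe is a `(frame, outside)` pair in which `outside=True` marks the frame where the object leaves the scene. The function name `visible_segments` and this input shape are assumptions for illustration, not CVAT's actual API, and the real feature would be implemented in the TypeScript UI.

```python
def visible_segments(
    keyframes: list[tuple[int, bool]], last_frame: int
) -> list[tuple[int, int]]:
    """Return inclusive (start, end) frame ranges where the track is visible.

    `keyframes` holds (frame, outside) pairs; `last_frame` is the final
    frame of the video, used to close a segment that never goes outside.
    """
    segments: list[tuple[int, int]] = []
    start = None  # first frame of the currently open visible segment
    for frame, outside in sorted(keyframes):
        if not outside and start is None:
            start = frame  # the object (re)appears here
        elif outside and start is not None:
            segments.append((start, frame - 1))  # visible up to the frame before
            start = None
    if start is not None:
        segments.append((start, last_frame))
    return segments
```

The first segment's start and the last segment's end give the overall start and end positions of the track, which the timeline could use as navigation targets.
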
  6. IDEA: Account deletion and optimized resources erasing

    • Description: Users often want to delete their accounts, and to comply with personal data laws in different countries CVAT must provide this functionality. However, the process is currently manual, and automating it would be a great contribution. A related issue is that resources are currently removed in the main server processes, which may cause timeouts on the client when these resources take significant time to delete.
    • Expected Outcomes:
      • Users may request deletion of their accounts and all associated data from the GUI
      • The request requires password authentication, plus email confirmation if an email backend is configured
      • Deletion is postponed for a configured period of time (e.g. 24 or 48 hours) and performed in a worker
      • During this cooldown period, the user may abort the request
      • Deletion may not be possible in some cases (e.g. if a user is an owner of an organization)
      • Resources (tasks, projects) must be removed in workers
    • Resources:
      • N/A
    • Skills Required: Python (Django, DRF), TypeScript (React)
    • Possible Mentors: Maria Khrustaleva
    • Difficulty: Medium
    • Duration: 175 hours
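
The cooldown behaviour outlined in this idea (deletion is postponed, can be aborted during the waiting period, and is then picked up by a worker) could be modelled roughly as follows. `DeletionRequest` and its fields are hypothetical names for this sketch only; a real implementation would presumably persist the request in the database and have a background worker poll or schedule it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DeletionRequest:
    """Hypothetical record of a pending account deletion."""
    user_id: int
    requested_at: datetime
    cooldown: timedelta = timedelta(hours=24)  # configurable, e.g. 24 or 48 hours
    aborted: bool = False

    @property
    def execute_at(self) -> datetime:
        """Earliest moment at which a worker may perform the deletion."""
        return self.requested_at + self.cooldown

    def abort(self, now: datetime) -> bool:
        """Cancel the request; allowed only while the cooldown is running."""
        if self.aborted or now >= self.execute_at:
            return False
        self.aborted = True
        return True

    def is_due(self, now: datetime) -> bool:
        """True when the worker should delete the account and its data."""
        return not self.aborted and now >= self.execute_at
```

A worker in this sketch would periodically select requests for which `is_due` returns true and remove the account along with its tasks and projects outside the main server process.
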

Idea Template

1. ## _IDEA:_ <Descriptive Title>
   * ***Description:*** 3-7 sentences describing the task
   * ***Expected Outcomes:***
      * < Short bullet list describing what is to be accomplished >
      * <i.e. create a new module called "bla bla">
      * < Has method to accomplish X >
      * <...>
   * ***Resources:***
      * [For example a paper citation](https://arxiv.org/pdf/1802.08091.pdf)
      * [For example an existing feature request](https://github.com/opencv/cvat/pull/5608)
      * [Possibly an existing related module](https://github.com/opencv/cvat/tree/develop/cvat/apps/opencv) that includes OpenCV JavaScript library.
   * ***Skills Required:*** < for example mastery plus experience coding in Python, college course work in vision that covers AI topics, python. Best if you have also worked with deep neural networks. >
   * ***Possible Mentors:*** < your name goes here >
   * ***Difficulty:*** <Easy, Medium, Hard>
   * ***Duration:*** <90, 175 or 350 hours>

Potential projects mentors

GSoC admins
