HGVS notation for dup in 109 becomes ins in 110 #1633

barbarian1803 · 2024-03-12T04:51:58Z

Describe the issue

For the variant below:

#CHROM POS ID REF ALT QUAL FILTER INFO
chr21 5233678 . A AATTT . . .

In VEP 109.3, this variant has the HGVS notation: ENST00000623753.1:n.132-758_132-755dup
In VEP 110.1/111.0, it has the notation: ENST00000623753.1:n.132-755_132-754insAAAT
I have noticed many similar variants that used to be reported as dup and are now reported as ins in VEP 110 and 111.
The correct notation would be the dup.

Another example

#CHROM POS ID REF ALT QUAL FILTER INFO
chr21 13933439 . C CT . . .
It used to be: ENST00000451663.5:n.2429+398dup
It now becomes: ENST00000451663.5:n.2429+398_2429+399insA
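
For anyone who wants to sanity-check similar cases, here is a minimal sketch of the dup-versus-ins decision as the HGVS recommendations describe it: shift the insertion as far 3' as possible, then call it a dup if the inserted bases equal the bases immediately 5' of the insertion point. This is not VEP's implementation. The function names and the toy sequences are made up for illustration, coordinates are plain 1-based positions on a transcript-oriented string (no intronic offsets), and no real genome sequence is used. The reverse-complement helper is included because the inserted ATTT in the VCF appears as AAAT in the HGVS string, which is consistent with a transcript on the reverse strand.

```python
# Hypothetical helper names; a sketch of the HGVS 3' rule, not VEP code.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def describe_insertion(ref: str, pos: int, inserted: str) -> str:
    """Describe an insertion placed after the first `pos` bases of `ref`.

    `ref` must already be in transcript orientation. Returns a simplified
    HGVS-like suffix such as '4_6dup', '3dup' or '2_3insTTA'.
    """
    if not inserted:
        raise ValueError("inserted sequence must not be empty")
    n = len(inserted)
    # 3' shift: while the next reference base equals the first inserted base,
    # move the insertion point one base to the right and rotate the insert.
    while pos < len(ref) and ref[pos] == inserted[0]:
        inserted = inserted[1:] + inserted[0]
        pos += 1
    # After shifting, it is a duplication if the n bases just 5' of the
    # insertion point are identical to the inserted sequence.
    if pos >= n and ref[pos - n:pos] == inserted:
        start, end = pos - n + 1, pos  # 1-based, inclusive
        return f"{start}dup" if n == 1 else f"{start}_{end}dup"
    return f"{pos}_{pos + 1}ins{inserted}"


if __name__ == "__main__":
    print(reverse_complement("ATTT"))                # AAAT
    print(describe_insertion("ACGTTGCA", 2, "GTT"))  # 4_6dup
    print(describe_insertion("ACTG", 3, "T"))        # 3dup
    print(describe_insertion("ACGT", 2, "TTA"))      # 2_3insTTA
```

In the first toy call, inserting GTT after position 2 gives the same sequence as duplicating positions 4 to 6, so the shifted description is a dup, which is the behaviour expected for the two variants above.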

Additional information

Run via Docker for version 110.1 and via the VEP web interface for the latest version, v111.

System

VEP version: 110.1/111
VEP Cache version: 110/111


likhitha-surapaneni · 2024-03-12T13:34:41Z

Hi @barbarian1803 ,
Thank you for reporting this. A fix has been applied for the upcoming release to address the issue. With it, the HGVSc will be reported as dup instead of ins.

Kind regards,
Likhitha

GSYongWu · 2024-04-02T05:54:48Z

I have encountered the same problem and hope it can be updated as soon as possible.

aksenia · 2024-06-18T16:00:18Z

Hi, is this still an issue in v112? Thank you!

GSYongWu · 2024-06-19T07:35:12Z

This issue is resolved, but I have discovered a new problem. The CDS coordinates for some genes are incorrect. For example, the mutation SRGAP2:NM_015326.5, c.85A>T(p.T29S) has been annotated as c.994A>T(p.T332S). I suspect it is a database issue.

davmlaw · 2024-08-01T08:16:37Z

@GSYongWu - I think it's a RefSeq problem, not a VEP one. What build are you using? If GRCh37, then looking at the RefSeq GFF entry for NM_015326.5, something is a bit strange: the cDNA match starts at 910, not 1.

Maybe you could try using GRCh38 rather than GRCh37. If the problem goes away, that shows it's a RefSeq problem, and they can close this issue as fixed.
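
A quick arithmetic check on the numbers quoted in this thread, as a sketch only. It shows that the reported shift is consistent with an alignment whose cDNA match starts at 910, but it does not prove that this is the cause. The function name is hypothetical.

```python
def shift_is_consistent(expected_c: int, observed_c: int,
                        expected_p: int, observed_p: int,
                        cdna_match_start: int) -> bool:
    """Check whether a coding-position shift matches a cDNA match start offset."""
    c_shift = observed_c - expected_c  # shift in nucleotides
    p_shift = observed_p - expected_p  # shift in codons
    # The nucleotide shift should equal the offset of the first aligned base,
    # and it should be exactly three times the protein shift.
    return c_shift == cdna_match_start - 1 and c_shift == 3 * p_shift


if __name__ == "__main__":
    # c.85A>T (p.T29S) expected, c.994A>T (p.T332S) observed, match starts at 910
    print(994 - 85)                                       # 909
    print(332 - 29)                                       # 303
    print(shift_is_consistent(85, 994, 29, 332, 910))     # True
```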

GSYongWu · 2024-08-02T06:28:58Z

@davmlaw Is it possible that the GFF file used by VEP this time is incorrect, causing the coordinates for certain gene annotations to be inaccurate? This part was always correct in older versions of VEP.

davmlaw · 2024-08-02T11:35:03Z

Yes. Of course it can be wrong!

The GFF is produced by taking sequences reported by labs around the world over many, many years, then aligning them with automated tools (algorithms built on our understanding of biology) against a pretty arbitrary reference sequence. Something can go wrong at every single step of that process, or arbitrary decisions get made where you can't know which is right, and it is all done at massive scale.

A quick glance at the differences between RefSeq and Ensembl transcripts (which are trying to do pretty much the same thing) shows you the scale of how imperfect it is.

It's super useful and valuable, though! Not to knock either team.

The transcript sequences differ per version, and the alignments for a given sequence can differ per build.

When working this out, it helps to explicitly list the genome builds and, in your examples, the transcript versions for your expected results (e.g. NM_015326.4 is length 6781, NM_015326.5 is length 6884).

It is also better to raise a new issue for a new problem than to add it to an existing, unrelated issue raised by someone else that is now fixed (as this makes it hard for the hardworking VEP people to manage their project and keep track of issues).

likhitha-surapaneni · 2024-10-14T15:16:39Z

Hi @GSYongWu,
Can you please let us know if you are still facing this issue? I am not able to reproduce it with release/112. If you are still facing the issue, can you let us know the exact command you are using?

likhitha-surapaneni self-assigned this Mar 12, 2024
