podcastGuid is not unique in the db dump (single id maps to multiple different podcasts) #30

Open
AmitAronovitch opened this issue Jun 18, 2023 · 1 comment

AmitAronovitch commented Jun 18, 2023

I do not know if this is a problem in the dump, in the actual assignment of the GUIDs, or in my understanding of the data, but the db appears to contain many duplicate podcastGuid values.

sqlite> SELECT COUNT(podcastGuid) number, podcastGuid FROM podcasts GROUP BY podcastGuid ORDER BY number DESC LIMIT 10;
4403|c9c7bad3-4712-514e-9ebd-d1e208fa1b76
169|
84|d9e6a1f6-b3cb-52f8-b4d6-55ae407eb310
68|cb7f498e-3b27-5d94-b342-125314350f98
62|be6f0528-aa42-5049-8198-7ae186dd71d8
61|88d3c2be-c761-5b0d-af98-3f9529fada36
56|768f6d92-769e-5890-9e18-cf35dbb1fbe9
54|f15e059b-d30f-5fbc-a2cd-076260c065a6
52|4749488e-b530-5e96-9ac8-d73d6939a04a
44|31b9658a-eebc-5c9f-9e0d-86adb2473793

In particular, the first podcastGuid in this list (4403 rows) seems to be assigned to many different, unrelated shows. The 10 most recent:

sqlite> SELECT id, itunesId, createdOn, title FROM podcasts WHERE podcastGuid='c9c7bad3-4712-514e-9ebd-d1e208fa1b76' ORDER BY createdOn DESC LIMIT 10;
6412704||1685802762|Buster Brown – Retro Radio Podcast
6412244|1688243540|1685770796|The Real Modern Family
6412119||1685762506|Archaeology Archives – The British History Podcast
6411786||1685742788|Environment – WFHB
6411527||1685725571|Premium Archives | IBCD
6411415||1685718156|12 months of mike – Dystopian Dance Party
6411412||1685718151|jheri curl june – Dystopian Dance Party
6410893||1685678977|booking Archives - Ranking Family Records
6410806||1685674807|Hormones Archives – Green Wisdom Health
6410251||1685641259|Relegation Archives - Learn English Through Football

Note that the latest entries here date to 3/6/2023 (day/month/year; the createdOn field converted from a Unix timestamp), which is just 1 day before I collected this db dump.
(The oldest one dates to 7/8/2020. I am not sure what that means.)
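
For anyone who wants to check the date conversion, here is a minimal sketch. It assumes createdOn holds Unix seconds and that UTC is the right timezone to interpret it in; the helper name `created_on_to_date` is made up for illustration.

```python
from datetime import datetime, timezone


def created_on_to_date(created_on: int) -> str:
    """Convert a createdOn value (assumed to be Unix seconds) to a UTC date string."""
    return datetime.fromtimestamp(created_on, tz=timezone.utc).strftime("%Y-%m-%d")


# createdOn of the most recent row in the query output above (id 6412704)
latest = created_on_to_date(1685802762)
print(latest)
```

Interpreting the value in a local timezone instead of UTC could shift the date by a day for timestamps close to midnight.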

@AmitAronovitch
Author

It looks like 657 of these also have an itunesId. Here are the 10 most recent:

sqlite> SELECT id, itunesId, createdOn, title FROM podcasts WHERE podcastGuid='c9c7bad3-4712-514e-9ebd-d1e208fa1b76' AND itunesId != '' ORDER BY createdOn DESC LIMIT 10;
6412244|1688243540|1685770796|The Real Modern Family
6393756|1688568693|1684828983|Musica
6393526|1660502054|1684808713|Geek Grills
6351312|1685701690|1683088752|STEPHANIE MILLER SHOW
6347422|1665003735|1682961552|Hope FM UK
6223448|1670155999|1679069953|On The Record
6096235|1675550917|1678274167|Tienda Online Invitada | Ropa y Accesorios | Flow112
6056478|1673263286|1677119877|ポッドキャストでファンづくり
6046057|1672761458|1676772414|James Whale
6044088|1672136346|1676688301|Story All The Way Down
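
The two checks above can be reproduced outside the sqlite shell. What follows is only a sketch under stated assumptions: the table and column names are taken from the queries in this issue, the in-memory sample table is a tiny stand-in for the real dump (three rows copied from the output above plus three invented rows), and the function names are hypothetical. Unlike the first query, it skips blank podcastGuid values so the empty-GUID bucket is not reported as a collision.

```python
import sqlite3

GUID = "c9c7bad3-4712-514e-9ebd-d1e208fa1b76"


def guid_collisions(conn, limit=10):
    """Return (count, podcastGuid) for non-blank GUIDs shared by more than one row."""
    return conn.execute(
        """
        SELECT COUNT(*) AS number, podcastGuid
        FROM podcasts
        WHERE podcastGuid IS NOT NULL AND podcastGuid != ''
        GROUP BY podcastGuid
        HAVING COUNT(*) > 1
        ORDER BY number DESC, podcastGuid
        LIMIT ?
        """,
        (limit,),
    ).fetchall()


def count_with_itunes_id(conn, guid):
    """Count rows for one GUID that also carry a non-empty itunesId."""
    row = conn.execute(
        "SELECT COUNT(*) FROM podcasts "
        "WHERE podcastGuid = ? AND itunesId IS NOT NULL AND itunesId != ''",
        (guid,),
    ).fetchone()
    return row[0]


def make_sample_db():
    """Build a tiny in-memory stand-in for the dump (schema is an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE podcasts (id INTEGER PRIMARY KEY, itunesId, "
        "createdOn INTEGER, title TEXT, podcastGuid TEXT)"
    )
    rows = [
        # Rows copied from the query output in this issue
        (6412704, None, 1685802762, "Buster Brown – Retro Radio Podcast", GUID),
        (6412244, 1688243540, 1685770796, "The Real Modern Family", GUID),
        (6393756, 1688568693, 1684828983, "Musica", GUID),
        # Invented rows: two blank GUIDs and one unique GUID
        (1, None, 0, "hypothetical blank A", ""),
        (2, None, 0, "hypothetical blank B", ""),
        (3, None, 0, "hypothetical unique", "00000000-0000-0000-0000-000000000001"),
    ]
    conn.executemany("INSERT INTO podcasts VALUES (?, ?, ?, ?, ?)", rows)
    return conn


conn = make_sample_db()
collisions = guid_collisions(conn)
with_itunes = count_with_itunes_id(conn, GUID)
print(collisions, with_itunes)
```

To run it against the real dump, replace `make_sample_db()` with `sqlite3.connect(...)` pointing at the downloaded file.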
