I do not know whether this is a problem in the dump, in the actual assignment of the GUIDs, or in my understanding of the data, but the db seems to contain a lot of duplicated podcastGuid values:
sqlite> SELECT COUNT(podcastGuid) number, podcastGuid FROM podcasts GROUP BY podcastGuid ORDER BY number DESC LIMIT 10;
4403|c9c7bad3-4712-514e-9ebd-d1e208fa1b76
169|
84|d9e6a1f6-b3cb-52f8-b4d6-55ae407eb310
68|cb7f498e-3b27-5d94-b342-125314350f98
62|be6f0528-aa42-5049-8198-7ae186dd71d8
61|88d3c2be-c761-5b0d-af98-3f9529fada36
56|768f6d92-769e-5890-9e18-cf35dbb1fbe9
54|f15e059b-d30f-5fbc-a2cd-076260c065a6
52|4749488e-b530-5e96-9ac8-d73d6939a04a
44|31b9658a-eebc-5c9f-9e0d-86adb2473793
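A minimal sketch for separating shared GUIDs from missing ones, since the blank row in the GROUP BY output looks like a different case. COUNT(podcastGuid) skips NULLs, so that blank row presumably holds empty strings rather than NULLs. The in-memory table, its column types, and the `guid-1` / `guid-2` rows are made up for illustration; only the column names come from the queries in this thread.

```python
import sqlite3

# Toy stand-in for the dump: hypothetical rows, column names as used in the thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE podcasts (id INTEGER PRIMARY KEY, title TEXT, podcastGuid TEXT)"
)
conn.executemany(
    "INSERT INTO podcasts (id, title, podcastGuid) VALUES (?, ?, ?)",
    [
        (1, "Show A", "guid-1"),
        (2, "Show B", "guid-1"),
        (3, "Show C", "guid-1"),
        (4, "Show D", "guid-2"),  # unique GUID, not a duplicate
        (5, "Show E", ""),        # empty GUID
        (6, "Show F", ""),        # empty GUID
        (7, "Show G", None),      # NULL GUID
    ],
)

# GUIDs that are really shared by more than one row, ignoring missing values.
SHARED_GUIDS_SQL = """
SELECT podcastGuid, COUNT(*) AS number
FROM podcasts
WHERE podcastGuid IS NOT NULL AND podcastGuid != ''
GROUP BY podcastGuid
HAVING COUNT(*) > 1
ORDER BY number DESC
"""

# Rows with no usable GUID at all (NULL or empty string).
MISSING_GUIDS_SQL = """
SELECT COUNT(*)
FROM podcasts
WHERE podcastGuid IS NULL OR podcastGuid = ''
"""

shared_guids = conn.execute(SHARED_GUIDS_SQL).fetchall()
missing_guid_rows = conn.execute(MISSING_GUIDS_SQL).fetchone()[0]

print(shared_guids)       # [('guid-1', 3)]
print(missing_guid_rows)  # 3
```

The two SQL strings should also work pasted directly at the sqlite> prompt against the real dump, assuming the table and column names match.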
In particular, the most frequent podcastGuid in these results, c9c7bad3-4712-514e-9ebd-d1e208fa1b76 (4403 rows), seems to be assigned to many different shows:
sqlite> SELECT id, itunesId, createdOn, title FROM podcasts WHERE podcastGuid='c9c7bad3-4712-514e-9ebd-d1e208fa1b76' ORDER BY createdOn DESC LIMIT 10;
6412704||1685802762|Buster Brown – Retro Radio Podcast
6412244|1688243540|1685770796|The Real Modern Family
6412119||1685762506|Archaeology Archives – The British History Podcast
6411786||1685742788|Environment – WFHB
6411527||1685725571|Premium Archives | IBCD
6411415||1685718156|12 months of mike – Dystopian Dance Party
6411412||1685718151|jheri curl june – Dystopian Dance Party
6410893||1685678977|booking Archives - Ranking Family Records
6410806||1685674807|Hormones Archives – Green Wisdom Health
6410251||1685641259|Relegation Archives - Learn English Through Football
Note that the latest entries here date to 3/6/2023 (day/month/year; createdOn field, converted from a Unix timestamp), which is just 1 day before I collected this db dump.
(The oldest one dates to 7/8/2020. I am not sure what that means.)
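For reference, the timestamp conversion can be done inside SQLite itself with datetime(..., 'unixepoch'), which returns UTC. Below is a rough sketch, assuming createdOn holds Unix seconds. The in-memory table and its column types are an assumption, and it holds only three of the rows listed above, so the "oldest" value here is the oldest of the sample, not of the real dump.

```python
import sqlite3

GUID = "c9c7bad3-4712-514e-9ebd-d1e208fa1b76"

# Toy table holding three (id, createdOn) pairs copied from the output above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE podcasts (id INTEGER PRIMARY KEY, createdOn INTEGER, podcastGuid TEXT)"
)
conn.executemany(
    "INSERT INTO podcasts (id, createdOn, podcastGuid) VALUES (?, ?, ?)",
    [
        (6412704, 1685802762, GUID),
        (6412244, 1685770796, GUID),
        (6410251, 1685641259, GUID),
    ],
)

# Oldest and newest createdOn for one GUID, rendered as UTC date strings.
RANGE_SQL = """
SELECT datetime(MIN(createdOn), 'unixepoch') AS oldest,
       datetime(MAX(createdOn), 'unixepoch') AS newest,
       COUNT(*) AS total
FROM podcasts
WHERE podcastGuid = ?
"""

oldest, newest, total = conn.execute(RANGE_SQL, (GUID,)).fetchone()
print(newest)  # 2023-06-03 14:32:42
print(oldest, total)
```

Run against the real dump, the same query would give the full date range for this GUID in one step.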
Looks like 657 of these also have an itunesId. Here are the 10 most recent ones:
sqlite> SELECT id, itunesId, createdOn, title FROM podcasts WHERE podcastGuid='c9c7bad3-4712-514e-9ebd-d1e208fa1b76' AND itunesId != '' ORDER BY createdOn DESC LIMIT 10;
6412244|1688243540|1685770796|The Real Modern Family
6393756|1688568693|1684828983|Musica
6393526|1660502054|1684808713|Geek Grills
6351312|1685701690|1683088752|STEPHANIE MILLER SHOW
6347422|1665003735|1682961552|Hope FM UK
6223448|1670155999|1679069953|On The Record
6096235|1675550917|1678274167|Tienda Online Invitada | Ropa y Accesorios | Flow112
6056478|1673263286|1677119877|ポッドキャストでファンづくり
6046057|1672761458|1676772414|James Whale
6044088|1672136346|1676688301|Story All The Way Down
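A count like the 657 above can be taken in a single pass, together with the total. This is only a sketch: whether a missing itunesId is stored as NULL or as an empty string in the dump is not clear from the output, so the query below treats both as missing. The in-memory table, its column types, and the choice of which sample rows get NULL versus an empty string are assumptions.

```python
import sqlite3

GUID = "c9c7bad3-4712-514e-9ebd-d1e208fa1b76"

# Toy table: ids and itunesIds taken from the thread; NULL vs '' is assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE podcasts (id INTEGER PRIMARY KEY, itunesId INTEGER, podcastGuid TEXT)"
)
conn.executemany(
    "INSERT INTO podcasts (id, itunesId, podcastGuid) VALUES (?, ?, ?)",
    [
        (6412704, None, GUID),        # missing itunesId stored as NULL
        (6412244, 1688243540, GUID),
        (6412119, "", GUID),          # missing itunesId stored as ''
        (6393756, 1688568693, GUID),
    ],
)

# Total rows for the GUID and how many of them carry a usable itunesId.
COUNT_SQL = """
SELECT COUNT(*) AS total,
       SUM(CASE WHEN itunesId IS NOT NULL AND itunesId != '' THEN 1 ELSE 0 END)
           AS with_itunes
FROM podcasts
WHERE podcastGuid = ?
"""

total, with_itunes = conn.execute(COUNT_SQL, (GUID,)).fetchone()
print(total, with_itunes)  # 4 2
```

The filter `itunesId != ''` used above already drops NULLs as well, because a comparison with NULL is never true in SQLite, so the explicit IS NOT NULL is there mainly for readability.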