tv_grab_uk_freeview aborting again #241

Open
misar1 opened this issue Aug 20, 2024 · 21 comments

@misar1

misar1 commented Aug 20, 2024

I seem to have found further problems, this time with channel 18 (More 4). The first error below aborted a normal 4-day run; the second occurred when I grabbed channel 18 only, with --debug.

could not fetch https://www.freeview.co.uk/api/program?sid=18&nid=64257&pid=crid://csi.enh.digitaluk.co.uk/6d07edf4-e83a-4d34-9349-198cb7766479&start=2024-08-23T11:30:00+0000&duration=PT30M, error: 524 Unknown code, aborting

Fetching https://www.freeview.co.uk/api/program?sid=18&nid=64257&pid=crid://csi.enh.digitaluk.co.uk/fee40905-7040-4f4a-a172-6578bbc35647&start=2024-08-21T21:00:00+0000&duration=PT1H5M from server.
could not fetch https://www.freeview.co.uk/api/program?sid=18&nid=64257&pid=crid://csi.enh.digitaluk.co.uk/fee40905-7040-4f4a-a172-6578bbc35647&start=2024-08-21T21:00:00+0000&duration=PT1H5M, error: 503 Service Unavailable, aborting

@honir
Contributor

honir commented Aug 20, 2024

These error codes are coming from the Freeview website.

Typically, 503 Service Unavailable and 524 (a Cloudflare timeout) are transient errors. Rerunning the request after a short wait usually succeeds (i.e. try again later!).

A list of these HTTP status codes is here.

We do not control these codes -- the issue is with the Freeview server. The fetch aborts because an unknown amount of data is missing, so it is unsafe to continue.

@garybuhrmaster
Contributor

524 is a Cloudflare response regarding timing out contacting the origin (www.freeview.co.uk) server. I would expect that error (and probably the other) to be transitory.

A quick (not robust) check right now showed both urls returned what looked like appropriate content, but that is obviously quite some time after your report.

@misar1
Author

misar1 commented Aug 20, 2024

The issue for me is that, depending on where such errors occur, the script can abort after collecting many hours of good data and produce a 0-byte XML file. The run today (4 days, 70 channels) should take ~5 hours and it was getting close to completion when it aborted.

I know nothing about Perl programming but is it not possible to trap errors and continue or at least use data already obtained?

@garybuhrmaster
Contributor

> I know nothing about Perl programming but is it not possible to trap errors and continue or at least use data already obtained?

In some cases perhaps, and some grabbers do make such an attempt (where it makes sense). I have added the "enhancement" label to your issue. Whether the author will agree, and when they will have the resources to consider your request is unknown.

The best answer is to learn enough Perl to contribute and scratch your own itch. That is what open source is about.

It is also possible that other grabbers (perhaps requiring a (paid) subscription) also provide the data you wish to obtain.

@honir
Contributor

honir commented Aug 21, 2024

> The run today (4 days, 70 channels) should take ~5 hours and it was getting close to completion when it aborted.

You could break that down into several, shorter, runs and then combine the output files using tv_cat or tv_merge.
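
As an illustration of that suggestion, the sketch below only builds the command lines for one short run per day plus a final tv_cat merge; it does not run them. It assumes the grabber accepts the usual XMLTV options --config-file, --days, --offset and --output, and that tv_cat accepts --output (check each tool's --help first). The file names and the config path are hypothetical.

```python
import shlex

GRABBER = "tv_grab_uk_freeview"  # the grabber discussed in this thread


def split_run_commands(total_days, config_file, merged="freeview.xml"):
    """Return one single-day grabber command per day, then a tv_cat command.

    Assumes the standard XMLTV grabber options (--days, --offset, --output,
    --config-file); the output file names are made up for illustration.
    """
    parts = []
    commands = []
    for offset in range(total_days):
        part = f"freeview_day{offset}.xml"  # hypothetical per-day file name
        parts.append(part)
        commands.append([
            GRABBER,
            "--config-file", config_file,
            "--days", "1",
            "--offset", str(offset),
            "--output", part,
        ])
    # Final step: concatenate the per-day files into one listings file.
    commands.append(["tv_cat", "--output", merged, *parts])
    return commands


if __name__ == "__main__":
    for cmd in split_run_commands(4, "freeview.conf"):
        print(shlex.join(cmd))
```

With runs split this way, an abort costs only the day that failed, and that day can be rerun on its own before merging.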

@misar1
Author

misar1 commented Aug 21, 2024

> You could break that down into several, shorter, runs and then combine the output files using tv_cat or tv_merge.

Tried a run this morning and it aborted after 33 minutes. Seems the Freeview EPG is not viable at present.

honir added a commit that referenced this issue Aug 27, 2024
Web page cache was only used in debug mode: now used always.
@SteveIngamells

> You could break that down into several, shorter, runs and then combine the output files using tv_cat or tv_merge.
>
> Tried a run this morning and it aborted after 33 minutes. Seems the Freeview EPG is not viable at present.

New user here... Still having these issues. Debug output indicates a lot of stuff is actually getting through and being cached.

I will try to limit the channels I am asking for to see if it means I can get sufficient to actually use something!

@honir
Contributor

honir commented Oct 14, 2024

Is there any pattern to the Cloudflare CDN timeouts? I haven't determined one yet.

A Cloudflare error 524 hides so many different possible underlying reasons it's impossible to guess the real failure.

@SteveIngamells

SteveIngamells commented Oct 14, 2024 via email

@misar1
Author

misar1 commented Oct 15, 2024

Here is another "unknown code" error:
getting list of channels: ##################################################
getting listings: #######could not fetch https://www.freeview.co.uk/api/program?sid=67&nid=64257&pid=crid://csi.digitaluk.co.uk/2ba7c648-ac3c-4034-8069-5e539ca85743&start=2024-10-15T11:00:00+0000&duration=PT30M, error: 524 Unknown code, aborting

The URL shows only normal text for that programme in a browser so unless it's the - after ex the problem may be a control code.

@SteveIngamells

> Here is another "unknown code" error:
> getting list of channels: ##################################################
> getting listings: #######could not fetch https://www.freeview.co.uk/api/program?sid=67&nid=64257&pid=crid://csi.digitaluk.co.uk/2ba7c648-ac3c-4034-8069-5e539ca85743&start=2024-10-15T11:00:00+0000&duration=PT30M, error: 524 Unknown code, aborting
>
> The URL shows only normal text for that programme in a browser so unless it's the - after ex the problem may be a control code.

It's a bit vague:
"When the 524 A timeout occurred status code is received, it implies that a successful HTTP Connection was made between Cloudflare and the origin server, however the HTTP Connection timed out before the HTTP request was completed. Cloudflare typically waits 100 seconds for an HTTP response and returns this HTTP status code if nothing is received."

100 seconds is a bit of a long wait!

The only thing I would suggest is a retry or two, as manual efforts to get the requested page a few moments later seem to work.

The other thing might be to skip to the next item, and somehow "block out" the missing data in the output file, based on the previous item's start time and program length and the next item's start time (assuming the next item can be found). But I am a C/C++ programmer and Perl is a mystery wrapped in an enigma to me, so how to actually do this escapes me!
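
A minimal sketch of that "block out" idea, written in Python and not in the grabber's Perl: given the programmes that were fetched successfully, it reports the time ranges left uncovered, using each item's start time and duration and the next item's start time. The function name and the (start, duration) data shape are assumptions for illustration; nothing here reflects how tv_grab_uk_freeview stores its data.

```python
from datetime import datetime, timedelta


def find_gaps(programmes):
    """Return (gap_start, gap_end) pairs between consecutive programmes.

    `programmes` is a list of (start, duration) tuples for one channel,
    sorted by start time, containing only the items fetched successfully.
    A gap is reported wherever the next item starts after the previous
    item ends.
    """
    gaps = []
    for (start, duration), (next_start, _) in zip(programmes, programmes[1:]):
        end = start + duration
        if next_start > end:
            gaps.append((end, next_start))
    return gaps


if __name__ == "__main__":
    def at(hour, minute):
        return datetime(2024, 10, 15, hour, minute)

    # The 08:35 item is left out to stand in for a failed fetch;
    # the 08:40 item is a hypothetical next programme.
    fetched = [
        (at(8, 10), timedelta(minutes=20)),
        (at(8, 30), timedelta(minutes=5)),
        (at(8, 40), timedelta(minutes=20)),
    ]
    for gap_start, gap_end in find_gaps(fetched):
        print(f"missing data from {gap_start:%H:%M} to {gap_end:%H:%M}")
```

Each reported range could then be written out as a placeholder entry, so the listings file stays usable and the missing slot is visible instead of silently absent.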

@garybuhrmaster
Contributor

> Is there any pattern to the Cloudflare CDN timeouts? I haven't determined one yet.
>
> A Cloudflare error 524 hides so many different possible underlying reasons it's impossible to guess the real failure.

I wonder if the get_nice functions should (optionally?) be enhanced to use LWP-UserAgent-Determined with appropriate retry timing numbers based on the get_nice delays and add 524 as a retryable code (as I don't think 524 is currently in the list) to automate the potential timeouts and retries with Cloudflare protected sites.
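
The retry behaviour described above can be sketched as follows. This is Python for illustration only, not Get_nice or LWP-UserAgent-Determined code: the set of retryable codes, the number of attempts and the delays are all assumptions, and `fetch` is a hypothetical callable returning a (status, body) pair.

```python
import time

# Assumption: status codes treated as transient, with Cloudflare's 524 added.
RETRYABLE = {500, 502, 503, 504, 524}


class FetchError(Exception):
    """Raised when a URL cannot be fetched after all attempts."""

    def __init__(self, status, url):
        super().__init__(f"could not fetch {url}, error: {status}")
        self.status = status
        self.url = url


def fetch_with_retries(fetch, url, attempts=4, base_delay=5.0, sleep=time.sleep):
    """Call fetch(url) until it returns status 200 or attempts run out.

    Waits base_delay, 2*base_delay, 4*base_delay, ... seconds between
    attempts. Non-retryable statuses fail immediately.
    """
    for attempt in range(attempts):
        status, body = fetch(url)
        if status == 200:
            return body
        if status not in RETRYABLE or attempt == attempts - 1:
            raise FetchError(status, url)
        sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    # Fake server: two transient failures, then success. No network, no waiting.
    replies = iter([(524, ""), (503, ""), (200, "{}")])
    body = fetch_with_retries(lambda url: next(replies), "https://example.invalid/",
                              sleep=lambda seconds: None)
    print("fetched:", body)
```

Passing `sleep` as a parameter keeps the delays testable; a real implementation would more likely tie them to the existing get_nice delay, as suggested above.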

@honir
Contributor

honir commented Oct 15, 2024

> I wonder if the get_nice functions should (optionally?) be enhanced to use LWP-UserAgent-Determined with appropriate retry timing numbers based on the get_nice delays and add 524 as a retryable code (as I don't think 524 is currently in the list) to automate the potential timeouts and retries with Cloudflare protected sites.

Good idea. That's likely a useful addition to Get_nice.

@honir
Contributor

honir commented Oct 15, 2024

In the meantime...I've added a sledgehammer retries approach. See if that helps with the transient errors.

@SteveIngamells

> In the meantime...I've added a sledgehammer retries approach. See if that helps with the transient errors.

Thanks! I will give it a try.

@SteveIngamells

> In the meantime...I've added a sledgehammer retries approach. See if that helps with the transient errors.
>
> Thanks! I will give it a try.

Unfortunately, same problem (I didn't see any evidence of retries). Last few lines of debug output:

Fetching https://www.freeview.co.uk/api/program?sid=38&nid=64257&pid=crid://panorama.five.tv/versions/C5184410026A&start=2024-10-15T08:10:00+0000&duration=PT20M from server.
Fetching https://www.freeview.co.uk/api/program?sid=38&nid=64257&pid=crid://panorama.five.tv/versions/C5216600020A&start=2024-10-15T08:30:00+0000&duration=PT5M from server.
Fetching https://www.freeview.co.uk/api/program?sid=38&nid=64257&pid=crid://panorama.five.tv/versions/C5216600021A&start=2024-10-15T08:35:00+0000&duration=PT5M from server.
could not fetch https://www.freeview.co.uk/api/program?sid=38&nid=64257&pid=crid://panorama.five.tv/versions/C5216600021A&start=2024-10-15T08:35:00+0000&duration=PT5M, error: 524 Unknown code, aborting

Using the pre-built Windows binary xmltv 1.3.0_020

Steve

@honir
Contributor

honir commented Oct 15, 2024

You may have been too quick for the build ;)
You need 1.3.0_021

@SteveIngamells

> You may have been too quick for the build ;) You need 1.3.0_021

Oops! Got that and now trying again!

@misar1
Author

misar1 commented Oct 15, 2024

This may be tempting fate but the retries approach seems to be a significant improvement. I just completed two successful runs with the same 64 channels for 1 and 2 days.

On the second run there was a non-fatal error:
Use of uninitialized value $content in numeric eq (==) at \XMLTV_1\par-6d696b6573\cache-e1b1563a021779c5084d873b11553dd9e308b482\inc/script/tv_grab_uk_freeview line 534.
Line 534 is if (length($content) == 0) { which is in the sub fudgeprogs code so it looks as if that was triggered at least once.

@SteveIngamells

SteveIngamells commented Oct 16, 2024

> This may be tempting fate but the retries approach seems to be a significant improvement. I just completed two successful runs with the same 64 channels for 1 and 2 days.
>
> On the second run there was a non-fatal error: Use of uninitialized value $content in numeric eq (==) at \XMLTV_1\par-6d696b6573\cache-e1b1563a021779c5084d873b11553dd9e308b482\inc/script/tv_grab_uk_freeview line 534. Line 534 is if (length($content) == 0) { which is in the sub fudgeprogs code so it looks as if that was triggered at least once.

Hi.

So far so good - a successful 1-day run yesterday evening with 1.3.0_021 and still going on a 7-day run started last night when I went to bed. (Taking forever but I am retrieving all channels.)

I also saw that error message in the first run. No sign in the second but it may have dropped out of the cmd window buffer!

Steve

@SteveIngamells

SteveIngamells commented Oct 17, 2024

I ran a successful 7-day download the night before last - took 20 hours! (and that's on a 10-core i9, with a 900 Mbps-each-way internet connection!) Noticed a couple of errors in passing but it recovered.

Unfortunately my VBox (the reason for all this messing about) refuses to find my server to download the EPG file! More technical problems :)
