[Bug]: Warning for wrong formatted files on opening .epub, but no errors on F7 integrity check #763

Closed
PhoenixIV opened this issue Jul 10, 2024 · 10 comments

@PhoenixIV

Bug Description

Hi, I am new to Sigil.

I wanted to open a file I had previously edited with calibre. On opening it, I got a warning that there were problems with the file and that Sigil could try to auto-correct them.

I chose No because I wanted to see what was wrong. But when I then ran the integrity check (F7), no errors were found.
This does not make much sense to me.

Platform (OS)

Linux

OS Version / Specifics

Mint 20.2

What version of Sigil are you using?

2.2.0

Any backtraces or crash reports

No response

@PhoenixIV PhoenixIV added the bug label Jul 10, 2024
@dougmassay
Contributor

dougmassay commented Jul 10, 2024 via email

@kevinhendricks
Contributor

This is not a bug. This is how Sigil is designed.

The F7 well-formed (sanity) check just checks that there is a proper closing tag for every opening tag, i.e. that the text is minimally parseable. This is typically enough for other Sigil tools to actually work. It is not a full epubcheck-like validation. You should download and install the Sigil epubcheck plugin if you are looking for actual validation testing.

That error message told you exactly what you were missing: one of the xml header, doctype, html, head, or body tags. In your case it was most likely the doctype that was missing, as calibre does not think those are important, or the xml header.
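To illustrate the difference between the two checks, here is a minimal Python sketch. It is not Sigil's implementation: the function names `is_well_formed` and `missing_parts` and the regular expressions are assumptions made up for this example. It only shows how a file can be perfectly parseable (every tag closed) and still lack the xml header and doctype.

```python
import re
import xml.etree.ElementTree as ET


def is_well_formed(xhtml: str) -> bool:
    """Return True if the text parses as XML, i.e. every opening tag is closed."""
    try:
        ET.fromstring(xhtml.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


def missing_parts(xhtml: str) -> list[str]:
    """List which expected structural parts are absent (hypothetical check)."""
    checks = {
        "xml header": r"\A<\?xml\b",
        "doctype": r"<!DOCTYPE\s+html\b",
        "html": r"<html[\s>]",
        "head": r"<head[\s>]",
        "body": r"<body[\s>]",
    }
    return [
        name
        for name, pattern in checks.items()
        if not re.search(pattern, xhtml, flags=re.IGNORECASE)
    ]


# A file with no xml header and no doctype, but with all tags closed.
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Chapter 1</title></head>"
    "<body><p>Hello</p></body></html>"
)

print(is_well_formed(sample))  # True
print(missing_parts(sample))   # ['xml header', 'doctype']
```

A file like `sample` would pass a parse-only check while still being flagged by a check for the expected structural parts, which matches the behaviour described in this thread.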

If you are paranoid about what is being changed, you can do the following: Open your epub, say no to the fixing, make a Checkpoint, then save it and close out of Sigil. Then reopen your epub, allow it to auto fix, then use the Checkpoint Compare to get a side-by-side set of changes.

Closing this as this works as designed.

@PhoenixIV
Author

> If you are paranoid about what is being changed, you can do the following: Open your epub, say no to the fixing, make a Checkpoint, then save it and close out of Sigil. Then reopen your epub, allow it to auto fix, then use the Checkpoint Compare to get a side-by-side set of changes.

That's beautiful! I would not use the word paranoid - it is interesting to see. A lot is changed. Not always for the better.

> This is not a bug.

Well, not knowing how simple the F7 wellformed check is, I think it is fair to assume the same outcome when running an integrity check. One could argue it makes no sense to run two different integrity checks.

Would be fun to be able to see what the assistant would like to change from the dialog window.

The W3C epubcheck tool does not care about the doctype, by the way. It does not complain. Even though it is supposed to show every little problem.

@kevinhendricks thank you for your reply. It assumed a little bit more knowledge on my side.

@PhoenixIV
Author

> The W3C epubcheck tool does not care about the doctype, by the way. It does not complain. Even though it is supposed to show every little problem.

Ah, never mind. I just found out this is a bug: w3c/epubcheck#982

@kevinhendricks
Contributor

Please make it very clear exactly what changes made by auto fixing you think are "Not always for the better". I have yet to see it do any damage in over 15 years of using and developing Sigil.

@PhoenixIV
Author

PhoenixIV commented Jul 10, 2024

It's nice to see your detailed interest.

Never mind, this was more about artifacts that are still left over from the conversion.

Many empty <p> </p> tags, and other places with spaces, that Sigil converts to non-breaking spaces (&#160;).
I could get rid of those anyway.

Another thing is personal taste in indentation. E.g. I prefer tabs or four spaces, and there does not seem to be an option for that. (See request #318 - I would like the ability to easily alter tab/indentation size)
And I am just not used to body being on indentation level 0, as seen in the wizard, but it makes sense.

So see this thread as complete, thanks for the help.

@kevinhendricks
Contributor

What wizard? Sigil neither adds nor removes non-breaking spaces. It sounds like you are not properly using CSS to handle whitespace, indentation, margins, line spacing, etc. Or are you talking about code indentation? If the latter, and you want to change how code is displayed, use Sigil's mend and prettify tools.

Asking basic questions is the purpose of our user Forum on MobileRead. We also have a Sigil User manual you may want to read.

@dougmassay
Contributor

dougmassay commented Jul 10, 2024 via email

@PhoenixIV
Author

Oh people, this is really getting OT. I maintained bug trackers myself and I would not have filed here if the original issue wasn't weird enough for me to let you know from a programming perspective.

> calibre has inserted an invisible non-breaking space character to do the same thing, and Sigil's default Preserve Entities feature converts it to an entity.

You are right, this is what happened.
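For anyone who lands here with the same confusion, a small Python sketch of that effect follows. It is only an illustration, not Sigil's or calibre's code: `preserve_nbsp` and `count_spacer_paragraphs` are hypothetical helper names. The first turns the invisible U+00A0 character into the visible `&#160;` entity, and the second counts paragraphs that contain nothing but whitespace or that entity.

```python
import re

NBSP = "\u00a0"  # the invisible non-breaking space character


def preserve_nbsp(text: str) -> str:
    """Rewrite each literal U+00A0 as the visible numeric entity &#160;."""
    return text.replace(NBSP, "&#160;")


def count_spacer_paragraphs(xhtml: str) -> int:
    """Count <p> elements that hold only whitespace (incl. U+00A0) or &#160;."""
    pattern = re.compile(r"<p\b[^>]*>(?:\s|&#160;)*</p>")
    return len(pattern.findall(xhtml))


# The second paragraph looks empty in an editor but contains U+00A0.
sample = "<p>Text</p><p>\u00a0</p><p> </p>"

converted = preserve_nbsp(sample)
print(converted)                           # <p>Text</p><p>&#160;</p><p> </p>
print(count_spacer_paragraphs(converted))  # 2
```

The character was in the file all along; writing it as an entity only makes it visible, which is why it looks as if spaces were "converted".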

> For the record: using empty p tags for vertical spacing is frowned upon just about everywhere.

I frown, too. I got a file from a friend I am asked to convert. Still deciding if I want to put in that extra work or leave it as it is - good enough / works.

> The most popular epub rendering engine in the world will simply ignore [empty p tags].

Thanks for the warning.

Let's leave it at that.

@dougmassay
Contributor

> Let's leave it at that.

You mean give you the last word? ;)

@Sigil-Ebook Sigil-Ebook locked as resolved and limited conversation to collaborators Jul 10, 2024

3 participants