Mixed Formats (DOS and UTF-8) #1370
Comments
What happens when you press "Show Anyway"? Can you load the file with nano or another simple text editor? |
This error message is shown when the Gtk.SourceFileLoader throws an error while loading the file. I wouldn't have thought file size would be an issue on modern hardware. |
Yes, I can open it anyway, but it opens without syntax highlighting, so I
need to set it to PHP manually. There was some further strange behaviour
afterwards, so I gave up and opened the file with Kate, the standard editor on
Manjaro Linux, without any issues.
But anyway, I like the Code editor because it's very tiny and powerful. I
want to make it my main p
Code editor. It has the best search and replace I've ever seen, without needing
regex for line breaks, for example.
Jeremy Wootten wrote on Sun, 20 Aug 2023, 21:02:
… What happens when you press "Show Anyway"? Can you load the file with nano
or another simple text editor?
|
I'm not at home, but I will try it in a simple editor on my laptop soon.
|
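@tamer73 Thanks for the info. Could you try running Code from the terminal command line (io.elementary.code) and see what output is produced when you try to load the problematic file? You should see a critical error message from the SourceFileLoader with more information. If you could make the problem file available it would help investigate the problem. |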
Sure, I will do that as soon as possible. I'm on vacation right now😊
Jeremy Wootten wrote on Mon, 21 Aug 2023, 10:19:
… @tamer73 <https://github.com/tamer73> Thanks for the info. Could you try
running Code from the terminal command line (io.elementary.code) and see
what output is produced when you try to load the problematic file? You
should see a critical error message from the SourceFileLoader with more
information. If you could make the problem file available it would help
investigate the problem.
|
Sorry for the late answer, but now I'm on my laptop and I get the same result.
Starting Code from the terminal prints:
** (io.elementary.code:2360): CRITICAL **: 17:54:05.757: Document.vala:373:
Es gab einen Fehler bei der Zeichensatzkonvertierung und ein
Ausweichzeichen musste genutzt werden.
I can open the file with nano without any issues.
Sadly I can't provide the file because it's sensitive code from my company!
I hope it helps anyway.
|
The translation:
There was a character set conversion error and a fallback character had to
be used.
|
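Ah, OK. I wonder why Gtk.SourceFileLoader produces that error but nano does not. Not sure if we need to show that information to the user or just load the file anyway. As you found, you can use "Show Anyway" to load the file. Can you see which character(s) have been altered? If you have the right language pack(s) installed you should have all the character sets you need, I would have thought. |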
Maybe because nano is console based? I will take a look at it soon. Just
arrived home a few minutes ago and my cats are surprisingly very impatient😂
Think I will find some time tomorrow. Have a good evening 🙋🏻♂️
Jeremy Wootten wrote on Sun, 27 Aug 2023, 20:42:
… Ah, OK. I wonder why Gtk.SourceFileLoader produces that error but nano does
not. Not sure if we need to show that information to the user or just load the
file anyway. As you found, you can use "Show Anyway" to load the file. Can
you see which character(s) have been altered? If you have the right
language pack(s) installed you should have all the character sets you need,
I would have thought.
|
Is the original file encoded as UTF-8 or something else? It may be possible to fix this by setting candidate encodings in the loader so that more than one encoding is tried. If you are able to produce a non-sensitive file that still gives the error that would help develop a fix. |
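To illustrate the "candidate encodings" idea mentioned above, here is a minimal Python sketch, not Code's actual Vala implementation. It tries each encoding in order and keeps the first one that decodes without errors. The function name and the candidate list are assumptions made up for this example.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical candidate list. Order matters: ISO-8859-1 accepts every byte
# sequence, so it would mask any encoding listed after it.
CANDIDATES = ("utf-8", "iso-8859-1")


def decode_with_candidates(
    data: bytes, candidates: Iterable[str] = CANDIDATES
) -> Optional[Tuple[str, str]]:
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # strict decoding failed, try the next candidate
    return None  # no candidate could decode the data


if __name__ == "__main__":
    sample = "// überprüfen\r\n".encode("iso-8859-1")
    print(decode_with_candidates(sample))
```

A fallback such as ISO-8859-1 never fails, so it can only ever be a guess; that is presumably why a user-visible choice of encoding is still useful.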
I will take a look as soon as possible
|
After loading the file anyway, it's necessary to choose PHP manually for syntax
highlighting. But the major issue here is that I can't save changes! "Save As"
doesn't work either, by the way...
I can't see any changes, but it's hard to track 4477 lines of code.
After closing the editor, which I had started from zsh, I get the following errors:
(io.elementary.code:2044): GLib-GObject-CRITICAL **: 12:49:06.607:
../glib/gobject/gsignal.c:2778: instance '0x5609752ab400' has no handler
with id '575'
(io.elementary.code:2044): GLib-GObject-CRITICAL **: 12:49:06.614:
../glib/gobject/gsignal.c:2778: instance '0x560975511cd0' has no handler
with id '3329'
(io.elementary.code:2044): GLib-GObject-CRITICAL **: 12:49:06.615:
../glib/gobject/gsignal.c:2778: instance '0x5609755368f0' has no handler
with id '3445'
(io.elementary.code:2044): GLib-GObject-CRITICAL **: 12:49:06.615:
../glib/gobject/gsignal.c:2778: instance '0x5609755325e0' has no handler
with id '3414'
I think there are 4 errors because I've tried to save the file, including
with "Save As".
But I think this will help:
file edit-test.php
edit-test.php: HTML document, ISO-8859 text, with very long lines (3330),
with CRLF, CR line terminators
Hope this will help meanwhile - need to pickup gf now
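For readers who want to check a file of their own for the two properties reported by `file` above (non-UTF-8 bytes and mixed CRLF/CR line terminators), here is a rough Python sketch. The function name and the returned dictionary keys are made up for this example; it does not reproduce how `file` itself works.

```python
def describe_bytes(data: bytes) -> dict:
    """Report UTF-8 validity and line-terminator counts for raw file bytes."""
    try:
        data.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False

    crlf = data.count(b"\r\n")
    # Every CRLF pair contains one CR and one LF, so subtract the pairs
    # to get the bare (standalone) terminators.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    return {"valid_utf8": valid_utf8, "crlf": crlf, "cr": cr, "lf": lf}


if __name__ == "__main__":
    print(describe_bytes(b"a\r\nb\rc\xfc\r\n"))
```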
|
@tamer73 I think you need to post the test file to e.g. https://pastebin.com/ or maybe use the "Attach files by dragging & dropping, selecting or pasting them" function at the bottom of the GitHub comment box (although I've only ever used that for pictures). Or you could send it as an email attachment to [email protected] or [email protected] |
I'm really sorry that I can't do that; the data is too sensitive. I already
tried to narrow it down to the non-sensitive parts, but that's really too much
effort for me right now!
But I think I found the issue while trying to find the longest line!
awk 'length > max_length { max_length = length; longest_line = $0 } END {
print longest_line }' edit-test.php
1|127 ✘
awk: Kommandozeile:1: (FILENAME=edit-test.php FNR=361)
...
Warnung: Es wurden unbekannte Multibyte-Daten gefunden. Ihre Daten
entsprechen eventuell nicht der gesetzten Region
...
*Translation of the warning:*
Warning: Unknown multibyte data was found. Your data may not correspond to
the set region
I just wonder why I can't reproduce this in other editors.
I'm now installing notepadqq, which was my favourite editor in the past. It's a
lightweight editor that takes up less than Geany but much more than elementary Code because of its dependencies on current Manjaro Linux.
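The awk warning above points at a specific record (FNR=361). A similar check can be done without exposing the file's contents. The following Python sketch works on raw bytes and reports which lines are not valid UTF-8; the function names are invented for this example, and unlike awk it measures line length in bytes rather than in locale-dependent characters.

```python
def invalid_utf8_lines(data: bytes) -> list:
    """Return 1-based numbers of lines whose bytes are not valid UTF-8."""
    bad = []
    # bytes.splitlines() splits on LF, CR and CRLF, so mixed terminators work.
    for number, line in enumerate(data.splitlines(), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
    return bad


def longest_line_length(data: bytes) -> int:
    """Length in bytes of the longest line, not counting its terminator."""
    return max((len(line) for line in data.splitlines()), default=0)


if __name__ == "__main__":
    sample = b"<?php\r\n// \xfcberpr\xfcfen\r\necho 1;\r"
    print(invalid_utf8_lines(sample), longest_line_length(sample))
```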
|
OK, now I've narrowed it down to a PHP file of 125 bytes, and it looks like the issue shows up when the non-UTF-8 data occurs multiple times! It looks like a temporary buffer overflow when too many handling exceptions occur!
Notepadqq doesn't have issues with this file.
I have sent you an email with this file. It shows that notepadqq and elementary Code handle the non-UTF-8 data in the same file differently! My locale is set to UTF-8.
Finally, a small file to reproduce the issue.
|
Thanks for your efforts in narrowing down the cause ❤️ - I'll try to get a fix out soon. |
Glad to help! You're welcome. You've put so much effort into this and I
really like small footprint projects like this. I don't know any other
graphical editor which is so tiny and powerful. Despite being so tiny,
it has the best search and replace outside of regex.
Thank you too for this masterpiece 😊👍
|
So it seems that your file is encoded in "DOS format" (according to nano) and the culprit line is converted by nano to
//######################################## �berpr�fen ########################################
and by Code to
//######################################## \FCberpr\FCfen ########################################
Two characters have been replaced by "unknown character" characters. If you use "Save As" in Code to save the file (immediately after using "Show Anyway") with either the same or different name, close the original tab and then open the saved file it loads correctly and is recognized as PHP. This is actually intended behaviour for dealing with what Code thinks are potentially corrupted/non-text files - it stops you trying to edit them and potentially make things worse. However, in this case the original file was misidentified as problematic due to DOS encoding which, it appears, the Gtk.SourceFileLoader does not handle properly by default. I'll see if there is a way round this. |
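The two displays quoted in this thread (nano's `�berpr�fen` and Code's `\FCberpr\FCfen`) are consistent with the byte 0xFC, which is "ü" in ISO-8859-1 but is not valid on its own in UTF-8. A small Python sketch of that behaviour, assuming the word in the file is "überprüfen":

```python
raw = b"\xfcberpr\xfcfen"  # bytes as they would appear in an ISO-8859-1 file

# Strict UTF-8 decoding rejects the lone 0xFC bytes outright.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Lenient decoding substitutes U+FFFD (the "unknown character") per bad byte.
with_fallback = raw.decode("utf-8", errors="replace")

# Decoding with the legacy encoding recovers the intended word.
as_latin1 = raw.decode("iso-8859-1")

print(strict_ok, with_fallback, as_latin1)
```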
That's strange; in one of my attempts I deleted everything else and then
Code opened it without any messages. That's why I thought several
issues must combine to trigger the problem.
Now I have opened that file with notepadqq from the command line and no errors occur there that would help...
Jeremy Wootten wrote on Tue, 29 Aug 2023, 10:48:
… So it seems that your file is encoded in "DOS format" (according to nano)
and the culprit line is converted by nano to
//######################################## �berpr�fen ########################################
and by Code to
//######################################## \FCberpr\FCfen ########################################
Two characters have been replaced by "unknown character" characters.
If you use "Save As" in Code to save the file (immediately after using
"Show Anyway") with either the same or different name, close the original
tab and then open the saved file it loads correctly and is recognized as
PHP. This is actually intended behaviour for dealing with what Code thinks are
potentially corrupted/non-text files - it stops you trying to edit them and
potentially make things worse.
However, in this case the original file was misidentified as problematic
due to DOS encoding which, it appears, the Gtk.SourceFileLoader does not handle
properly by default. I'll see if there is a way round this.
|
I can convert your file so that Code shows the expected characters (I presume) using the command:
I sent the output to a separate file to avoid overwriting the original. Opening the converted file shows:
Doing an octal dump on the original shows that the problem characters are encoded as hexadecimal I presume the file comes from a Windows system? Would you be wanting to return it to Windows after editing on Linux? |
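The exact command used above did not survive in this copy of the thread, so the following is not that command. It is only a Python sketch of the same kind of conversion: re-encode a legacy file as UTF-8 and write the result to a separate file so the original is never overwritten. The function name and the default source encoding (`cp1252`) are assumptions for this example.

```python
from pathlib import Path


def convert_to_utf8(src: Path, dst: Path, source_encoding: str = "cp1252") -> None:
    """Re-encode src as UTF-8 into dst, leaving src and line endings untouched."""
    if dst.resolve() == src.resolve():
        raise ValueError("refusing to overwrite the original file")
    # Work on bytes so that CRLF/CR line terminators are preserved exactly.
    text = src.read_bytes().decode(source_encoding)
    dst.write_bytes(text.encode("utf-8"))
```

Usage would look like `convert_to_utf8(Path("edit-test.php"), Path("edit-test.utf8.php"))`, where the output file name is a placeholder. If the file is going back to a Windows system afterwards, whether the tools there expect UTF-8 is a separate question.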
Looking into this, it is surprisingly complicated to fix. I can get Code to load the file with the Windows character set by forcing the loader to use that charset/encoding - but then "normal" Linux files have some characters misinterpreted. There does not seem to be any guaranteed way to get the encoding and charset automatically from the file before actually loading it, and it seems the Gtk.SourceFileLoader only detects the encoding, not the character set, while actually loading it. I see NotepadQQ has gone to a lot of trouble to handle a wide variety of encodings/charsets and allows the user to choose and convert between them, so it is clearly possible. However, Code is primarily targeted at developing software on Linux and there are limited resources for its development. As this is an edge case it may not be fixed soon. Probably the best we could do is to offer the user a choice of character sets to try out on the file if it fails to load, though this assumes the user knows which character set to choose. The best way forward for you is probably to convert the file out of an old unsupported format and into a modern one that both Linux and Windows support out of the box. |
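The trade-off described above, where forcing the Windows character set fixes the legacy file but breaks ordinary UTF-8 files, can be reproduced in a few lines of Python. This is only an illustration; `decode_forced` is a made-up helper, and `cp1252` stands in for "the Windows character set", which the thread does not name precisely.

```python
def decode_forced(data: bytes, encoding: str = "cp1252") -> str:
    """Decode bytes with one forced encoding and no detection at all."""
    return data.decode(encoding)


legacy_file = "überprüfen".encode("cp1252")  # old Windows-style file
modern_file = "überprüfen".encode("utf-8")   # typical Linux file

print(decode_forced(legacy_file))  # legacy bytes decode as intended
print(decode_forced(modern_file))  # each UTF-8 "ü" turns into two wrong characters
```

Because the forced decode succeeds in both cases without raising an error, the loader cannot tell from the result alone which interpretation was right.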
Thanks for taking a look at it. I'm OK with it and I can easily work around it. I just wanted to inform you, and at least we could shed a little more light on this behaviour. |
The source of this file is close to twenty years old. It needs to be reprogrammed anyway, so no worries. It was created by my boss in a way no one would do it today. So everything's fine and there's no pressure from any side :-) Should I close this issue? |
Well I'll leave it open with a revised description as it is a valid issue, but it will probably have a low priority for fixing unless another dev can see an easy fix. |
What Happened?
The error message says "No text Found. Maybe corrupt or no text file" when I try to open a 178 kB PHP file with Code version 7.1.0-1 on Manjaro Linux with everything up to date. Every other PHP file works fine, as expected! It looks like Code can't read the file into the buffer, so it can't even recognize that it's a PHP file.
Steps to Reproduce
Expected Behavior
Just wanted to view the php code
OS Version
Other Linux
Software Version
Latest release (I have run all updates)
Log Output
No response
Hardware Info
CPU: dual core Intel Core i3-4130 (-MT MCP-) speed/min/max: 1450/800/3400 MHz
Kernel: 6.1.44-1-MANJARO x86_64 Up: 29m Mem: 3.15/11.6 GiB (27.1%)
Storage: 461.98 GiB (3.9% used) Procs: 194 Shell: Zsh inxi: 3.3.29