gulp-concat doesn't seem to support UTF-16 #101
Comments
Is this with concat or gulp itself?
It seems like it is concat to me, considering the files that get munged are interleaved (every other file). I figured I'd post the problem here first and see if you guys could see it and, possibly, confirm or reject that it is with gulp-concat.
@billrawlinson Can you try just piping src to dest a bunch of times and see if that causes the issue as well?
Sure, I'll try Monday. I'm on the road now. If anyone else wants to know, I figure the problem is either in the file read or in concat.
So I ran the tests where I just pipe the files to dest, and nothing funky happens to the files in the process. I've updated the test demo project so that it does both. If you want to run the tests to see the results, just pull the project and give it a run. Each test now puts its results in a folder titled "results#", where # is the number of the test being run.
I'm guessing it has something to do with buffer conversions in concat-with-sourcemaps:
Probably mixing a bunch of encodings together using node's Buffer module is causing unexpected results.
In test example 2 (utf16le) and 3 (utf16be) the encodings are all the
@billrawlinson I mean that the separator is treated as UTF-8, so combining that with some UTF-16 buffers might be yielding weird results.
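To illustrate the point about the separator, here is a minimal sketch in plain Node.js, without gulp. It assumes the separator ends up as a single UTF-8 byte between UTF-16LE buffers; `joinBuffers` is a hypothetical helper written for this example and is not the actual gulp-concat or concat-with-sourcemaps code.

```javascript
// Three tiny "files", each encoded as UTF-16LE (2 bytes per code unit).
const files = ['aa', 'bb', 'cc'].map((text) => Buffer.from(text, 'utf16le'));

// Hypothetical helper: join buffers with a separator buffer between each pair.
function joinBuffers(buffers, separator) {
  const parts = [];
  buffers.forEach((buf, i) => {
    if (i > 0) parts.push(separator);
    parts.push(buf);
  });
  return Buffer.concat(parts);
}

// A UTF-8 newline is one byte (0x0a); a UTF-16LE newline is two (0x0a 0x00).
const utf8Joined = joinBuffers(files, Buffer.from('\n', 'utf8'));
const utf16Joined = joinBuffers(files, Buffer.from('\n', 'utf16le'));

// The one-byte separator shifts the second file off its 2-byte alignment.
// The next one-byte separator shifts the third file back into alignment,
// which matches the "every other file gets munged" symptom.
const broken = utf8Joined.toString('utf16le');
const intact = utf16Joined.toString('utf16le');

console.log(JSON.stringify(broken)); // first and third file readable, second garbled
console.log(JSON.stringify(intact)); // "aa\nbb\ncc"
```

With a separator encoded the same way as the files, every file stays aligned on a 2-byte boundary.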
Ah, that makes perfect sense.
I assume, due to the nature of gulp pipes, that concat has no way of knowing the encoding of the various buffers coming in to it from src?
You are correct; it is the separator character that is causing the problem. I set up the test as follows:
    function runConcatTest(d) {
      var testResults = gulp.src(d.sources)
        .pipe(concat(d.outfile, { newLine: '' }))
        .pipe(gulp.dest(d.outpath));
      testResults.on('data', printToConsole);
    }
Here I basically blanked out the newLine character, and tests 2 and 3 both work perfectly while tests 1 and 4 are all mucked up. If I don't override the newLine it is broken as before. Maybe, as a temporary solution, the readme could be updated to let people know that if they are joining UTF-16 files they should put their own newline at the end of the files and then override the join character to be nothing.
UPDATE: I updated the demo project to show the working scenario with test 2 using an empty string as the separator.
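The workaround described above (end each file with its own newline, then join with an empty separator) can be sketched in plain Node.js as follows. `newlineFor` and `withTrailingNewline` are hypothetical helpers invented for this illustration; the final `Buffer.concat` with nothing in between stands in for `newLine: ''`.

```javascript
// Hypothetical helper: newline bytes for the encodings discussed in this thread.
function newlineFor(encoding) {
  switch (encoding) {
    case 'utf16le':
      return Buffer.from([0x0a, 0x00]); // low byte first
    case 'utf16be':
      return Buffer.from([0x00, 0x0a]); // high byte first
    default:
      return Buffer.from('\n', 'utf8');
  }
}

// Hypothetical helper: append a newline in the file's own encoding.
function withTrailingNewline(contents, encoding) {
  return Buffer.concat([contents, newlineFor(encoding)]);
}

// Two UTF-16LE "files", each given its own correctly encoded newline,
// then joined with no separator at all.
const sources = ['one', 'two'].map((text) => Buffer.from(text, 'utf16le'));
const joined = Buffer.concat(
  sources.map((buf) => withTrailingNewline(buf, 'utf16le'))
);

console.log(JSON.stringify(joined.toString('utf16le'))); // "one\ntwo\n"
```

Because the newline is written in the same encoding as the file it belongs to, nothing one byte wide is ever placed between the buffers.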
@billrawlinson Hmm, trying to think up a solution here. Going to dig into the buffer docs and see if I can figure something out.
https://nodejs.org/api/buffer.html#buffer_class_method_buffer_isencoding_encoding We could emit a warning if the user mixes encodings (assuming we can't figure out a way to make it work).
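One caveat on the warning idea: `Buffer.isEncoding` only reports whether a string names an encoding that Node supports; it does not inspect a buffer's contents. A rough sketch of a mixed-encoding warning would therefore need some other signal, for example a byte order mark (BOM). `sniffBom` and `hasMixedEncodings` below are hypothetical helpers, and the approach assumes the files actually carry a BOM, which many do not.

```javascript
// Hypothetical helper: guess an encoding from a leading byte order mark.
// Returns null when no BOM is present. (FF FE is also the start of a
// UTF-32LE BOM; this sketch ignores that case.)
function sniffBom(contents) {
  if (contents.length >= 3 &&
      contents[0] === 0xef && contents[1] === 0xbb && contents[2] === 0xbf) {
    return 'utf8';
  }
  if (contents.length >= 2 && contents[0] === 0xff && contents[1] === 0xfe) {
    return 'utf16le';
  }
  if (contents.length >= 2 && contents[0] === 0xfe && contents[1] === 0xff) {
    return 'utf16be';
  }
  return null;
}

// Hypothetical helper: warn when the sniffed encodings are not all the same.
function hasMixedEncodings(buffers, warn = console.warn) {
  const seen = new Set(buffers.map((buf) => sniffBom(buf) || 'unknown'));
  if (seen.size > 1) {
    warn('Mixed encodings detected: ' + [...seen].join(', '));
    return true;
  }
  return false;
}

const leFile = Buffer.from('\ufeffone', 'utf16le'); // starts with FF FE
const plainFile = Buffer.from('two', 'utf8');       // no BOM
const messages = [];
const mixed = hasMixedEncodings([leFile, plainFile], (m) => messages.push(m));

console.log(mixed, messages);
// Buffer.isEncoding only validates encoding names:
console.log(Buffer.isEncoding('utf16le'));
```

A file without a BOM is reported as "unknown" here, so this could only ever back a warning, not a fix.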
I played around with this for a bit and it stumped me. @billrawlinson did you figure anything out?
I did not. I just resorted to not using UTF 16 👎
Have run into the same issue and it turns out the files that end up munged are UTF16 |
When concatenating files which are UTF 16 Little Endian (unicode) every other file gets munged a bit.
When concatenating files which are UTF 16 Big Endian the same result happens.
If you alternate files where the first is UTF16LE and the second is UTF16BE then just the very end of the second file gets munged.
I have set up a demo project that illustrates this and has a bunch of notes that explain why I even tried these things. I don't know for certain the problem is in gulp-concat (it could be in gulp itself in gulp.src()). https://github.com/finalcut/gulp-concat-bug