Inconsistent (reporting of) string encoding #1415

Closed
jennybc opened this issue Jul 7, 2017 · 8 comments

@jennybc
Contributor

jennybc commented Jul 7, 2017

Moving this from slack to here, as requested:

This mismatch between reported encoding in the Console and via knitr is confusing.

[screenshot: 2017-07-07 at 7:54:34 am]

The little example I was using:

(string <- "hi∑")
#> [1] "hi∑"
Encoding(string)
#> [1] "unknown"
@dpprdan
Contributor

dpprdan commented Aug 1, 2017

Is knitr using enc2native() or format() somewhere?
From console:

(string <- "hi∑")
# [1] "hi∑"
Encoding(string)
# [1] "UTF-8"
Encoding(enc2native(string))
# [1] "unknown"
Encoding(format(string))
# [1] "unknown"

In general, encoding markers seem to get lost easily. EDIT: My mistake in the linked example: encoding markers get lost (and rightfully so) when you try to convert code points that do not exist on the native code page to the native encoding.
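
A minimal sketch of how a marker can get lost, assuming only base R. The literal uses the `\u2211` escape for ∑ so that it does not depend on the encoding of the source file, and `iconv()` to latin1 stands in for a native code page without ∑. That stand-in is an assumption made for illustration, not a claim about what knitr does internally.

```r
# "\u2211" is the summation sign; a \u escape yields a string marked as UTF-8
string <- "hi\u2211"
Encoding(string)

# latin1 has no code point for this character. With the default sub = NA,
# iconv() typically returns NA for an element it cannot convert.
latin1 <- iconv(string, from = "UTF-8", to = "latin1")
is.na(latin1)

# With sub = "byte", unconvertible bytes are replaced by ASCII escapes.
# The result is plain ASCII, and ASCII strings never carry an encoding mark.
escaped <- iconv(string, from = "UTF-8", to = "latin1", sub = "byte")
escaped
Encoding(escaped)
```

The exact substitution text depends on the platform's iconv implementation, so no output is shown here.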

@yihui
Owner

yihui commented Aug 1, 2017

@dpprdan You are absolutely correct.

@dpprdan
Contributor

dpprdan commented Aug 2, 2017

@yihui: Which one, enc2native(), format() or both?

@yihui
Owner

yihui commented Aug 2, 2017

The former.

@dpprdan
Contributor

dpprdan commented Aug 18, 2017

At the risk of stating the obvious: if knitr is using enc2native(), then I guess it is to be expected that the reported encoding is "unknown", since a) enc2native() is changing the encoding and b) there is no code point for ∑ in most native encodings (well, at least not in my default encoding, cp1252 on Windows). I am not completely sure how this would play out on e.g. macOS (@jennybc's default platform?!) or Linux, where the native encoding should be UTF-8.

What do you see on the console with enc2native(string) and Encoding(enc2native(string))?

enc2native(string)
# [1] "hi<U+2211>"
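
To compare platforms, a small base R sketch along these lines could be run in each session. It only reports what the current locale does; the variable names `info` and `native` are illustrative, and the results will differ between a UTF-8 locale and a code page such as cp1252, so no output is shown.

```r
# Inspect the session's native encoding
info <- l10n_info()
info[["UTF-8"]]    # TRUE when the native encoding is UTF-8
info[["Latin-1"]]  # TRUE when the native encoding is Latin-1

# The \u escape keeps the example independent of the file encoding
string <- "hi\u2211"

# enc2utf8() keeps (or produces) a UTF-8 string regardless of the locale
Encoding(enc2utf8(string))

# enc2native() translates to the native encoding; what comes back, and how
# it is marked, depends on whether that encoding can represent the character
native <- enc2native(string)
native
Encoding(native)
```

If `info[["UTF-8"]]` is `TRUE`, the character is representable natively; on a code page without it, output like the `<U+2211>` escape above is what one would expect to see.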

@dpprdan
Contributor

dpprdan commented Mar 24, 2019

FWIW c788aff does not fix this on Windows (not that you claimed it would, @yihui).

> (string <- "hi∑")
[1] "hi∑"
> Encoding(string)
[1] "UTF-8"
> 
> knitr::knit(text = '
+ ```{r}
+ (x <- "hi∑")
+ Encoding(x)
+ ```
+ ')
  |......................                                           |  33%
  ordinary text without R code

  |...........................................                      |  67%
label: unnamed-chunk-1
  |.................................................................| 100%
  ordinary text without R code


[1] "\n\n```r\n(x <- \"hi∑\")\n```\n\n```\n## [1] \"hi<U+2211>\"\n```\n\n```r\nEncoding(x)\n```\n\n```\n## [1] \"unknown\"\n```\n"
Session info
devtools::session_info()
#> - Session info ----------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.5.3 (2019-03-11)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language EN                          
#>  collate  German_Germany.1252         
#>  ctype    German_Germany.1252         
#>  tz       Europe/Berlin               
#>  date     2019-03-24                  
#> 
#> - Packages --------------------------------------------------------------
#>  package     * version    date       lib source                        
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 3.5.3)                
#>  backports     1.1.3      2018-12-14 [1] CRAN (R 3.5.1)                
#>  callr         3.2.0      2019-03-15 [1] CRAN (R 3.5.3)                
#>  cli           1.1.0      2019-03-19 [1] CRAN (R 3.5.3)                
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 3.5.1)                
#>  desc          1.2.0      2018-10-25 [1] Github (r-lib/desc@7c12d36)   
#>  devtools      2.0.1      2018-10-26 [1] CRAN (R 3.5.1)                
#>  digest        0.6.18     2018-10-10 [1] CRAN (R 3.5.1)                
#>  evaluate      0.13       2019-02-12 [1] CRAN (R 3.5.2)                
#>  fs            1.2.7      2019-03-19 [1] CRAN (R 3.5.3)                
#>  glue          1.3.1      2019-03-12 [1] CRAN (R 3.5.3)                
#>  highr         0.8        2019-03-20 [1] CRAN (R 3.5.3)                
#>  htmltools     0.3.6      2017-04-28 [1] CRAN (R 3.5.1)                
#>  knitr         1.22.5     2019-03-23 [1] Github (yihui/knitr@072253d)  
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 3.5.1)                
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 3.5.1)                
#>  pkgbuild      1.0.3      2019-03-20 [1] CRAN (R 3.5.3)                
#>  pkgload       1.0.2      2018-10-29 [1] CRAN (R 3.5.1)                
#>  prettyunits   1.0.2      2015-07-13 [1] CRAN (R 3.5.1)                
#>  processx      3.3.0      2019-03-10 [1] CRAN (R 3.5.2)                
#>  ps            1.3.0      2018-12-21 [1] CRAN (R 3.5.2)                
#>  R6            2.4.0      2019-02-14 [1] CRAN (R 3.5.2)                
#>  Rcpp          1.0.1      2019-03-17 [1] CRAN (R 3.5.3)                
#>  remotes       2.0.2.9000 2019-03-23 [1] Github (r-lib/remotes@c26b7d0)
#>  rlang         0.3.2      2019-03-21 [1] CRAN (R 3.5.3)                
#>  rmarkdown     1.12       2019-03-14 [1] CRAN (R 3.5.3)                
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.5.1)                
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.5.1)                
#>  stringi       1.4.3      2019-03-12 [1] CRAN (R 3.5.3)                
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 3.5.2)                
#>  testthat      2.0.1      2018-10-13 [1] CRAN (R 3.5.1)                
#>  usethis       1.4.0      2018-08-14 [1] CRAN (R 3.5.1)                
#>  withr         2.1.2      2018-03-15 [1] CRAN (R 3.5.1)                
#>  xfun          0.5        2019-02-20 [1] CRAN (R 3.5.2)                
#>  yaml          2.2.0      2018-07-25 [1] CRAN (R 3.5.1)                
#> 
#> [1] D:/Users/Daniel/Documents/R/win-library/3.5
#> [2] C:/Program Files/R/R-3.5.3/library

@yihui
Owner

yihui commented Mar 25, 2019

@dpprdan It does not fix the issue on Windows. There are much deeper issues in base R on Windows that are beyond my control (such as r-lib/evaluate#59 as you already know).

@github-actions

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue by following the issue guide (https://yihui.org/issue/), and link to this old issue if necessary.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 10, 2020