I have a file open in Vim whose contents include Unicode escape sequences like this:

title=\"\u5e2e\u4e0a\u5934\u6761\">\u5e2e\u4e0a\u5934\u6761<\/a><\/li>\n

and I want Vim to display them as Chinese characters.

This is the output of the locale command in my terminal:

LANG=en_US.UTF-8
LANGUAGE=en
LC_CTYPE=en_US.UTF-8
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=en_US.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=

My OS is Ubuntu 14.04.

I know I can print the Chinese characters using Python 2:

print u"\u5e2e\u4e0a\u5934\u6761".encode('utf-8')
# Encode the unicode string as UTF-8, which matches the terminal encoding.
# If I write the result to a file and open it in Vim, I see the Chinese characters.

Is there a way to make Vim show the Chinese characters? If possible, can I do this with Vim script inside Vim, with a toggle to switch between the two representations?

I realize I could implement this myself along the lines above, but I would like to know whether an existing script already does it, so that I don't reinvent the wheel.

Lerner Zhang
  • :set fileencoding=utf8 or :set encoding=utf8 will do :) – SibiCoder Jul 05 '16 at 12:25
  • @SibiCoder No, that doesn't seem to work. I set it in my vimrc and reopened the file, but nothing changed. I also encoded the file to UTF-8 with Python, wrote it to a new file, and opened that with your setting in my vimrc, but that didn't work either. I cannot reproduce the result. – Lerner Zhang Jul 05 '16 at 12:35
  • It looks like you're using literal escaped unicode strings. That's not a problem with text encoding. I posted an answer that will convert the text to unicode. – Tommy A Jul 05 '16 at 13:26

1 Answer

You could use something like this to convert to Unicode:

%s/\%(\\u\x\+\)\+/\=eval('"'.submatch(0).'"')/g

And back:

%s/[^\x00-\x7f]/\=printf('\u%04x', char2nr(submatch(0)))/g

As a naive toggle command:

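function! s:toggle_unicode(line1, line2) abort
  " If the buffer contains any \uXXXX escape, decode the range; otherwise encode it.
  if search('\\u\x\+', 'n')
    execute a:line1.','.a:line2.'s/\%(\\u\x\+\)\+/\=eval(''"''.submatch(0).''"'')/g'
  else
    " Zero-pad to four hex digits so a following hex letter is not read as part of the escape.
    execute a:line1.','.a:line2.'s/[^\x00-\x7f]/\=printf(''\u%04x'', char2nr(submatch(0)))/g'
  endif
endfunction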

command! -range=% UnicodeToggle call s:toggle_unicode(<line1>, <line2>)
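
If you want to check what the two substitutions should produce, here is a rough Python 3 sketch of the same round trip outside Vim. The function names `unescape_unicode` and `escape_unicode` are made up for this example, and it assumes every escape has exactly four hex digits, so characters above U+FFFF are not handled.

```python
import re

# Matches one escape: a backslash, "u", then exactly four hex digits.
# Assumption: the input only uses this four-digit form.
ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def unescape_unicode(text):
    """Replace each \\uXXXX escape with the character it names."""
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def escape_unicode(text):
    """Replace each non-ASCII character with a zero-padded \\uXXXX escape.

    Only meaningful for characters up to U+FFFF.
    """
    return re.sub(r"[^\x00-\x7f]", lambda m: "\\u%04x" % ord(m.group(0)), text)


# Part of the sample line from the question, kept as a raw string so the
# backslashes stay literal.
sample = r'title=\"\u5e2e\u4e0a\u5934\u6761\">'
decoded = unescape_unicode(sample)
print(decoded)
print(escape_unicode(decoded) == sample)
```

The zero padding in `escape_unicode` is what keeps the round trip unambiguous when a short code point is followed by a character that happens to be a hex digit.
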
Tommy A