11

I'm looking at a file that has this in the .vimrc file:

set iskeyword+=@-@

I assumed it would let me delete a word containing a hyphen between letters, such as this-word, with a dw command, but it doesn't seem to work.

StevieD
  • OK, it looks like it adds @ as a character so it will be considered part of the word. But I don't understand the syntax here. Why the @ followed by a hyphen and another @ sign? – StevieD Feb 11 '17 at 23:57
  • Actually, I'm wrong. Result of set iskeyword is iskeyword=@,48-57,_,192-255,$,%,@-@,: with a literal @-@ in there. – StevieD Feb 12 '17 at 00:05

3 Answers

14

From :h 'isk:

See 'isfname' for a description of the format of this option.

Then from :h 'isf:

If the character is '@', all characters where isalpha() returns TRUE
are included.  Normally these are the characters a to z and A to Z,
plus accented characters.  To include '@' itself use "@-@".

So, as you said in your comment, it seems that @-@ stands for the @ character itself.

As to why this syntax is used, I suppose the reason is because a single @ is already used to denote all characters where isalpha() returns TRUE, which usually are the characters a to z and A to Z, plus accented characters. So, @ couldn't be used to denote itself.

As a workaround, maybe the syntax for a range of characters was chosen instead: the same syntax that is used, for example, to stand for all the lowercase alphabetical characters, a-z.

Maybe @-@ could be interpreted as the range of characters between the @ character, and the @ character, that is only the @ character and nothing else.
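
One way to check this reading is to test strings against \k, which follows 'iskeyword'. The sketch below assumes the buffer can be reset to Vim's default value (@,48-57,_,192-255 on most systems); if a filetype plugin relies on a different value, the results may differ.

```vim
" Reset the buffer-local value to the default so the checks are predictable.
setlocal iskeyword&

" '@' (character 64) is not a keyword character by default: echoes 0.
echo 'user@host' =~# '^\k\+$'

" "@-@" is the one-character range from '@' to '@': now echoes 1.
setlocal iskeyword+=@-@
echo 'user@host' =~# '^\k\+$'

" The hyphen is still not a keyword character: echoes 0.
echo 'this-word' =~# '^\k\+$'
```

The last check is why dw still stops at the hyphen in this-word: @-@ only adds the @ character, not -.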

user9433424
6

Going deeper with vim's "pattern":

(The content below mainly comes from Neovim's help.)

Character classes

The character range matches a fixed set of characters.
A character class is similar, but the set of characters can be redefined without changing the search pattern. For example, search for this pattern:

/\f\+

The "\f" item stands for file name characters. Thus this matches a sequence of characters that can be a file name. Which characters can be part of a file name depends on the system you are using.

This is specified with the 'isfname' option.

Actually, Unix allows using just about any character in a file name, including white space. But including all of these in 'isfname' would make it impossible to find the end of a file name in text.

The character classes are:

    item    matches                     option
    \i      identifier characters       'isident'
    \k      keyword characters          'iskeyword'
    \p      printable characters        'isprint'
    \f      file name characters        'isfname'

\I is like \i, but excludes digits. The same holds for \K, \P and \F.

NOTE: the above also work for multibyte characters.
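
As a small illustration of the difference between \i and \I, the following sketch assumes the default 'isident' value, which includes the digits 48-57. The string '9lives' is just a made-up sample.

```vim
" \i follows 'isident', digits included: echoes 9lives
echo matchstr('9lives', '\i\+')

" \I is the same class without the digits, so the match starts at 'l': echoes lives
echo matchstr('9lives', '\I\+')
```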

The ones below only match ASCII characters

\s  whitespace character: <Space> and <Tab>
\S  non-whitespace character; opposite of \s
\d  digit:                        [0-9]
\D  non-digit:                    [^0-9]
\x  hex digit:                    [0-9A-Fa-f]
\X  non-hex digit:                [^0-9A-Fa-f]
\o  octal digit:                  [0-7]
\O  non-octal digit:              [^0-7]
\w  word character:               [0-9A-Za-z_]
\W  non-word character:           [^0-9A-Za-z_]
\h  head of word character:       [A-Za-z_]
\H  non-head of word character:   [^A-Za-z_]
\a  alphabetic character:         [A-Za-z]
\A  non-alphabetic character:     [^A-Za-z]
\l  lowercase character:          [a-z]
\L  non-lowercase character:      [^a-z]
\u  uppercase character:          [A-Z]
\U  non-uppercase character:      [^A-Z]

NOTE: Using the atom is faster than the [] form.

NOTE: 'ignorecase', "\c" and "\C" are not used by character classes.

\_x  Where "x" is any of the characters above: the character class with end-of-line added.
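
A minimal sketch of what the added end-of-line means in practice, using the built-in search() function on the current buffer. The words "return" and "0" are only placeholder text.

```vim
" Finds "return" and "0" only when the whitespace between them is on one line.
call search('return\s\+0')

" Also finds them when "return" ends one line and "0" starts the next,
" because \_s accepts a line break in addition to <Space> and <Tab>.
call search('return\_s\+0')
```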

'isfname' 'isf'

The characters specified by this option are included in file names (and path names)

default string:

1. Windows:  "@,48-57,/,\,.,-,_,+,,,#,$,%,{,},[,],:,@-@,!,~,="
2. otherwise: "@,48-57,/,.,-,_,+,,,#,$,%,~,="

Spelled out one part per line, the Windows default is:

@
48-57
/
\
.
-
_
+
,
#
$
%
{
}
[
]
:
@-@
!
~
=

On other systems the default is the same, minus:

\
{
}
[
]
:
@-@
!

Filenames are used for

  1. gf commands
  2. [i commands
    (Display the first line that contains the keyword under the cursor. The search starts at the beginning of the file )
  3. in the tags file
  4. \f in a |pattern|

only the characters <= 255 are specified with this option.
( Multi-byte characters 256 and above, are always included. For UTF-8, the characters 0xa0 to 0xff are included as well.)

The format of this option is a list of parts, separated with commas.
Each part can be

  1. a single "character number", which is either:
  • a decimal number between 0 and 255, or
  • the ASCII character itself (does not work for digits).
  2. a range: two character numbers with '-' in between. Example:

    "_,-,128-140,#-43", meaning:

_
-
the range 128 to 140
the range # to 43 (that is, 35 to 43)

If a part starts with ^, the following character number (or range) will be excluded. Put the excluded character after the range where it is included.

To include ^ itself, use it as the last character of the option or the end of a range. Example:

"^a-z,#,^" (exclude 'a' to 'z', include '#' and '^')

@ (it looks like an "a", for "alphabet") represents all characters where isalpha() returns TRUE (see [:alpha:] or man isalpha).
About isalpha():

checks  for an alphabetic character; 
in the standard "C" locale, it is equivalent to (isupper(c) || islower(c)).  
In some locales, there may be additional characters for which isalpha() is true: letters which are neither uppercase nor lowercase (for example, accented characters such as é, â, î).

To include @ itself: use @-@.

Examples:

  • "@,^a-z": All alphabetic characters, excluding lower case ASCII letters.

  • "a-z,A-Z,@-@" : All letters plus the '@' character.

A comma can be included by using it where a character number is expected. Example:

  • "48-57,,,_" : Digits, comma and underscore.

A comma can be excluded by prepending a ^ Example:

  • " -~,^,,9" : All characters from space to '~', excluding comma, including <Tab>

See |option-backslash| about including spaces and backslashes.
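
The format rules above can be tried out on 'iskeyword', which uses the same format, by matching sample strings against \k. This is only a sketch: the variable g:saved_isk and the sample strings are made up for illustration, and the current value is saved and restored so the experiment leaves the buffer unchanged.

```vim
" Remember the buffer-local value so it can be restored afterwards.
let g:saved_isk = &l:iskeyword

" '@' = letters via isalpha(), 48-57 = digits, '@-@' = a literal '@',
" and the final '-' = a literal hyphen.
setlocal iskeyword=@,48-57,@-@,-
" Echoes 1: every character of the string is now a keyword character.
echo 'user@my-host' =~# '^\k\+$'

" '^a-z' removes the lowercase ASCII letters that '@' had included.
setlocal iskeyword=@,^a-z
" Echoes 1 for the uppercase string, then 0 for the lowercase one.
echo 'ABC' =~# '^\k\+$'
echo 'abc' =~# '^\k\+$'

" Put the original value back.
let &l:iskeyword = g:saved_isk
unlet g:saved_isk
```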

On systems that use a backslash as path separator, such as Windows, Vim tries to do its best to make it work as you would expect, but this is tricky. Vim will not remove a \ in front of a normal file name character on these systems (but it will on *nix).

You'd better not put a space in 'isfname'; otherwise, Vim doesn't know where a file name starts or ends when doing completion.

The & and ^ are not included by default, because these are special for cmd.exe.

'isident' 'isi'

("ident" stands for identifiers, not indent. Maybe you should change 'iskeyword' instead of 'isident'.)

default:

  1. Windows: "@,48-57,_,128-167,224-235"
  2. otherwise: "@,48-57,_,192-255"

Identifiers are used in:

  1. recognizing environment variables
  2. after a match of the 'define' option.
  3. \i in a |pattern|.

(For @, only characters up to 255 are used)

If you change this option, it might break expanding environment variables.
E.g., if / is included, Vim may fail to expand "$HOME/.local/share/nvim/shada/main.shada" correctly, because / then counts as part of the variable name.

'iskeyword' 'isk'

local to buffer

default

nvim :    "@,48-57,_,192-255"
Vi :      "@,48-57,_"

ps: no quotation mark in nvim's help here. strange.

Keywords are used in searching and recognizing with commands like:

  • [i (Display the first line that contains the keyword)
  • w
  • *
  • \k in a |pattern|.

For @, characters above 255 are checked against the "word" character class (any character that is not white space or punctuation). This is a separate notion from the ASCII-only \w atom ([0-9A-Za-z_]) listed earlier.
(An underscore, _, counts as a word character, not punctuation, although it is sometimes referred to as "underline". These days underscores are mostly used in passwords, e-mail addresses, etc.)

For C programs you could use

"a-z,A-Z,48-57,_,.,-,>"

a-z
A-Z
digits
_
.
-
>

For a help file (of Vim), it is set to all non-blank printable characters except *, " and |, so that CTRL-] on a command finds the help for that command.

This option also influences syntax highlighting, unless the syntax uses |:syn-iskeyword|.
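
Because 'iskeyword' is local to the buffer, it can be adjusted per file type instead of globally. The following is a minimal sketch for a vimrc; the group name hyphen_keywords and the choice of file types are assumptions made for illustration.

```vim
augroup hyphen_keywords
  " Clear the group first so re-sourcing the vimrc does not add duplicates.
  autocmd!
  " Hypothetical choice of file types: treat '-' as a keyword character
  " only in buffers where hyphenated names are common.
  autocmd FileType css,html setlocal iskeyword+=-
augroup END
```

Using setlocal keeps the change inside the matching buffers, so w, * and \k behave as before everywhere else.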

'isprint' 'isp'

global, string

default:

"@,161-255"

The characters given by this option are displayed directly on the screen.

It is also used for \p in a |pattern|. The characters from space (ASCII 32) to '~' (ASCII 126) are always displayed directly, even when they are not included in 'isprint' or (explicitly) excluded.

  0 -  31   "^@" - "^_" (Non-printable characters are displayed with two characters )
 32 - 126   always single characters

Illegal bytes from 128 to 255 (invalid UTF-8) are displayed as <xx>, with the hexadecimal value of the byte. When 'display' contains "uhex" all unprintable characters are displayed as <xx>.

The SpecialKey highlighting will be used for unprintable characters. |hl-SpecialKey|

Multi-byte characters 256 and above are always included, only the characters up to 255 are specified with this option. When a character is printable but it is not available in the current font, a replacement character will be shown.

Unprintable and zero-width Unicode characters are displayed as <xxxx>. There is no option to specify these characters.

Good Pen
0

For the specific case asked for in the original question:

delete a word containing a hyphen between letters, such as this-word, with a dw command

... you want to add - as a word character, like this:

:set iskeyword+=-

To change it back to the defaults, do this:

set iskeyword&

And if you want easy access to flexible definitions of what counts as a word, you could try using keybindings as shortcuts. Some of these are heuristics; don't take it for granted that they will work exactly how you want! Some ideas for your ~/.vimrc:

" Add various characters to the list of word characters, for \< \> searches and
" for motion commands like cw diw etc.
nnoremap <leader>/- :set iskeyword+=-<CR>
nnoremap <leader>/= :set iskeyword+==<CR>
nnoremap <leader>// :set iskeyword+=/<CR>
nnoremap <leader>/; :set iskeyword+=;<CR>
nnoremap <leader>/: :set iskeyword+=:<CR>
" common filenames (heuristic)
nnoremap <leader>/f :set iskeyword+=.,#,-<CR>
" common full path (heuristic)
nnoremap <leader>/p :set iskeyword+=.,#,/,~,-<CR>
" urls (heuristic)
nnoremap <leader>/u :set iskeyword+=:,/,.,~,?,=,;,#,&,+,-<CR>
" shell variables (heuristic)
nnoremap <leader>/v :set iskeyword+=$,{,},[,]<CR>
" make words be whitespace-delimited, like WORD
nnoremap <leader>/W :set iskeyword=!-~<CR>
" restore default definition of words
nnoremap <leader>/w :set iskeyword&<CR>
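
As a variation on the mappings above, a toggle can add and remove the hyphen with a single key. This is only a sketch: the function name ToggleHyphenKeyword and the <leader>/t mapping are made up for illustration. It uses setlocal, so unlike :set it changes only the current buffer's value.

```vim
" Hypothetical helper: flip '-' in and out of the buffer-local 'iskeyword'.
function! ToggleHyphenKeyword() abort
  " Look for '-' as a whole comma-separated part, not inside a range like 48-57.
  if index(split(&l:iskeyword, ','), '-') >= 0
    setlocal iskeyword-=-
  else
    setlocal iskeyword+=-
  endif
  " Show the resulting value.
  setlocal iskeyword?
endfunction

nnoremap <leader>/t :call ToggleHyphenKeyword()<CR>
```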

Or for a more powerful and flexible solution, consider using the kana/vim-textobj-user plugin.

jbyler