Regex Vim
How to do an inverse search in Vim?
(i.e., get lines not containing pattern)
search for the lines not containing a pattern
To search for the lines not containing Person A
at the beginning:
/^\(Person A\)\@!
/\v^(Person A)@!
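- replace it with something else (the match is zero-width, so \0 is empty and the command simply prepends "- " to each matching line):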
:%s/^\(Person A\)\@!/- \0/gc
text to play with
Person A: Hey, how's it going?
Person B: Not too bad, thanks. How about you?
Person A: I'm doing well! Have you finished that book you were reading?
Person B: Yes, I have. It was quite an interesting read.
Person A: That's great! Do you recommend it?
Person B: Absolutely, it's a must-read if you're into mystery novels.
Person A: Sounds intriguing. I'll definitely check it out. Thanks for the recommendation!
Person B: You're welcome! Let me know what you think once you've read it.
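To check the idea outside Vim, here is a minimal Python sketch. It assumes Python's re module, where Vim's \(Person A\)\@! is written (?!Person A). The sample lines are shortened from the text above, and the variable names are only illustrative.

```python
import re

# Assumed Python spelling of Vim's /^\(Person A\)\@!
NOT_PERSON_A = re.compile(r"^(?!Person A)")

lines = [
    "Person A: Hey, how's it going?",
    "Person B: Not too bad, thanks. How about you?",
    "Person A: I'm doing well!",
]

# Keep only the lines that do not start with "Person A".
not_a = [line for line in lines if NOT_PERSON_A.match(line)]
print(not_a)
```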
To search for the lines that start with neither Person A nor Person B:
/^\(Person A\|Person B\)\@!
/\v^(Person A|Person B)@!
- replace it with something else (again the match is zero-width, so "- " is prepended to each matching line):
:%s/^\(Person A\|Person B\)\@!/- \0/gc
text to play with
Person A: Hi guys, how are you both doing today?
Person B: I'm doing well, thanks for asking. How about you, Person C?
Person C: I'm good too. Thanks, Person B. How about you, Person A?
Person A: I'm great, thank you! Have either of you seen the new movie that just came out?
Person B: Not yet, but I've heard good things about it. What about you, Person C?
Person C: I actually saw it yesterday. It was really good!
Person A: That's awesome! I'll have to check it out soon.
Person B: Sounds like a plan. Maybe we can all go together next time.
Person C: That sounds like a great idea!
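The substitution can be tried outside Vim as well. This is a rough Python sketch, assuming that re.MULTILINE makes ^ behave like Vim's per-line ^ and that (?!Person A|Person B) stands in for \(Person A\|Person B\)\@!. The three sample lines are shortened from the text above.

```python
import re

# Assumed Python spelling of Vim's :%s/^\(Person A\|Person B\)\@!/- \0/
# The match is zero-width, so the replacement is inserted at the line start.
NOT_A_OR_B = re.compile(r"^(?!Person A|Person B)", re.MULTILINE)

text = "\n".join([
    "Person A: Hi guys, how are you both doing today?",
    "Person B: I'm doing well, thanks for asking.",
    "Person C: I'm good too.",
])

marked = NOT_A_OR_B.sub("- ", text)
print(marked)
```

Only the Person C line should come back with the "- " prefix.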
lookahead and lookbehind
If you want to search for a pattern only when it occurs next to another pattern, use the regex features “lookahead” and “lookbehind” (collectively “lookaround”). If you want to search for a pattern only when it doesn’t occur next to another, use their complements, “negative lookahead” and “negative lookbehind” (“negative lookaround”).
intro
Lookahead and lookbehind are two types of zero-width assertions in regular expressions (regex). They do not match characters but instead assert whether a match is possible.
- Lookahead:
- Lookahead assertions check if a pattern matches without including the matched text in the result. There are two types of lookahead:
- Positive lookahead: Ensures that the pattern matches.
- Negative lookahead: Ensures that the pattern does not match.
- Lookbehind:
- Lookbehind assertions check if a pattern matches before the current position without including the matched text in the result. There are two types:
- Positive lookbehind: Ensures that the pattern matches.
- Negative lookbehind: Ensures that the pattern does not match.
- Key differences:
- Direction: Lookahead checks the string to the right of the current position, while lookbehind checks the string to the left.
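- Length: Many engines (for example Python's re) require a fixed-length pattern inside a lookbehind, while lookahead can handle variable-length patterns. Vim's \@<= and \@<! do accept variable-length patterns, but they can be slow (see :h \@<=).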
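Whether the length restriction applies depends on the engine. As a small sketch in Python, whose re module does enforce fixed-width lookbehind (the patterns are made up for illustration):

```python
import re

# A lookahead may contain a variable-length pattern.
lookahead = re.compile(r"foo(?=\s+bar)")
print(bool(lookahead.search("foo   bar")))

# Python's re rejects a variable-length lookbehind at compile time.
try:
    re.compile(r"(?<=foo\s+)bar")
    lookbehind_ok = True
except re.error:
    lookbehind_ok = False
print(lookbehind_ok)
```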
Examples (Perl/PCRE syntax, not Vim's)
- Lookahead: Match “hello” only if followed by “world”.
/hello(?=world)/
- Lookbehind: Match “world” only if preceded by “hello”.
/(?<=hello)world/
- Negative lookahead: Match “hello” only if not followed by “world”.
/hello(?!world)/
- Negative lookbehind: Match “world” only if not preceded by “hello”.
/(?<!hello)world/
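The four patterns above can be run as they stand in Python's re module, which uses the same Perl-style syntax. A minimal sketch, with three made-up sample strings:

```python
import re

samples = ["helloworld", "hello there", "myworld"]

patterns = {
    "lookahead": r"hello(?=world)",
    "lookbehind": r"(?<=hello)world",
    "negative lookahead": r"hello(?!world)",
    "negative lookbehind": r"(?<!hello)world",
}

# For each assertion type, list the samples in which the pattern matches.
results = {
    name: [s for s in samples if re.search(pat, s)]
    for name, pat in patterns.items()
}
for name, matched in results.items():
    print(f"{name}: {matched}")
```

The two positive forms match only in "helloworld". The negative lookahead matches only in "hello there", and the negative lookbehind only in "myworld".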
check the vim help
" positive lookahead
:h \@=
" negative lookahead
:h \@!
" positive lookbehind
:h \@<=
" negative lookbehind
:h \@<!

\@=     Matches the preceding atom with zero width. {not in Vi}
        Like "(?=pattern)" in Perl.
        Example                 matches
        foo\(bar\)\@=           "foo" in "foobar"
        foo\(bar\)\@=foo        nothing
- without wildcards:
Positive lookahead:  \(find this\)\(followed by this\|or that\)\@=
Negative lookahead:  \(find this\)\(not followed by this\|or that\)\@!
Positive lookbehind: \(preceded by this\|or that\)\@<=\(find this\)
Negative lookbehind: \(not preceded by this\|or that\)\@<!\(find this\)
- with wildcards:
Positive lookahead:  \(find this\)\(.*\(eventually followed by this\|or that\)\)\@=
Negative lookahead:  \(find this\)\(.*\(not eventually followed by this\|or that\)\)\@!
Positive lookbehind: \(\(eventually preceded by this\|or that\).*\)\@<=\(find this\)
Negative lookbehind: \(\(not eventually preceded by this\|or that\).*\)\@<!\(find this\)
Note: For the wildcard versions, the extra parentheses are required so that the wildcard is excluded from the alternatives group, but is included in the lookaround group. This prevents duplicating the wildcards for every alternative.
- regular expression to find lines containing multiple specific words or patterns in any arbitrary order
- to find the words support and mail at once:
/^\(.*support\)\@=\(.*mail\)\@=
" for insensitive case
/^\(.*support\)\@=\(.*mail\)\@=\c
- additional hint:
- but do not contain the word service in the same line:
/^\(.*support\)\@=\(.*mail\)\@=\(.*service\)\@!
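The same any-order search can be sketched in Python to see which lines pass. This assumes (?=...) and (?!...) stand in for \@= and \@!, and re.IGNORECASE for \c. The sample lines are invented.

```python
import re

# Assumed Python spelling of /^\(.*support\)\@=\(.*mail\)\@=\(.*service\)\@!\c
WANTED = re.compile(r"^(?=.*support)(?=.*mail)(?!.*service)", re.IGNORECASE)

lines = [
    "Mail the support team",
    "support only",
    "mail support or customer service",
    "Support replies by mail within a day",
]

# Each lookahead starts at the line start, so word order does not matter.
hits = [line for line in lines if WANTED.search(line)]
print(hits)
```

Because every assertion is anchored at the start of the line and scans forward with .*, the words may appear in any order. Note that plain substrings are matched, so "mail" would also be found inside "email".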
If you’re familiar with PCRE or other regex engines, you may prefer lookahead and lookbehind assertions. Negative lookaround is also the only way to assert that a certain pattern is not present.
The following strings with @ are assertions that the preceding atom (which may be a group) does or does not exist just ahead or behind the current position in the pattern. The atom (or its absence) will match with zero length.
| | lookbehind | lookahead |
|---|---|---|
| positive | \(atom\)\@<= | \(atom\)\@= |
| negative | \(atom\)\@<! | \(atom\)\@! |
in very magic mode...
| | lookbehind | lookahead |
|---|---|---|
| positive | (atom)@<= | (atom)@= |
| negative | (atom)@<! | (atom)@! |
Miscellaneous
case insensitive search in Vim:
- You can use the \c escape sequence anywhere in the pattern. For example: /\ccopyright or /copyright\c or even /copyri\cght
- To do the inverse (case sensitive matching), use \C (capital C) instead.
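For comparison with other engines, here is a loose Python analogue. It assumes re.IGNORECASE plays the role of \c for the whole pattern. Python matches case-sensitively by default, so nothing like \C is needed there. The sample string is invented.

```python
import re

text = "Copyright notice"

# With the flag, the lowercase pattern matches the capitalised word.
insensitive = re.search(r"copyright", text, re.IGNORECASE)
# Without the flag the search is case sensitive and finds nothing.
sensitive = re.search(r"copyright", text)

print(insensitive is not None, sensitive is not None)
```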