A detailed comparison of regular expression usage in Python and JavaScript

Updated: 2024-05-13 08:28:53  Author: 酌滄

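A regular expression is a set of rules for extracting text from strings: the rule is written in regex syntax and then used to match the strings that satisfy it. This article compares regular expression usage in Python and JavaScript in detail, for readers who need a reference.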

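Preface

Regular expressions are a powerful tool in both Python and JavaScript for matching, searching, and manipulating strings. Although the basic syntax is similar, there are some differences. The main points of comparison in how Python and JavaScript construct and use regular expressions are as follows: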

1 Constructing and using regular expressions

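Feature	Python	JavaScript
Importing	Uses the re module	No import needed; built in
Defining a regular expression	re.compile(r"pattern")	/pattern/flags or new RegExp("pattern", "flags")
Match all	re.findall(pattern, string)	string.match(/pattern/g) or string.matchAll(/pattern/g)
Search (find the first match)	re.search(pattern, string)	string.match(/pattern/) or string.search(/pattern/); search returns only the index of the first match
Replace	re.sub(pattern, repl, string)	string.replace(/pattern/g, repl)
Split a string	re.split(pattern, string)	string.split(/pattern/)
Ignore-case flag	re.IGNORECASE or re.I (inline: (?i))	'i'
Multiline flag	re.MULTILINE or re.M (inline: (?m))	'm'
Dot matches any character (including newlines)	re.DOTALL or re.S (inline: (?s))	's' (dotAll, ES2018); in older engines use [^] or [\s\S]
Unicode matching	re.UNICODE or re.U (redundant in Python 3, where str patterns are Unicode by default)	'u' flag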
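
The flag rows of the table can be made concrete on the JavaScript side with a small sketch. The sample strings and variable names below are illustrative assumptions, not part of the original comparison, and the s flag assumes an engine that supports ES2018.

```javascript
// Illustrative input (an assumption for this sketch).
const multiLine = "first line\nSecond line";

// 'i': case-insensitive matching.
const ignoreCase = /second/i.test(multiLine);

// 'm': ^ and $ also match at line breaks, so each line start is found.
const lineStarts = multiLine.match(/^\w+/gm);

// 's' (dotAll): without it, '.' stops at the newline.
const withoutS = /first.*Second/.test(multiLine);
const withS = /first.*Second/s.test(multiLine);

// Fallback for engines without the 's' flag.
const fallback = /first[\s\S]*Second/.test(multiLine);

// 'u': '.' matches a whole code point instead of one UTF-16 code unit.
const emoji = "😀";
const noU = /^.$/.test(emoji);
const withU = /^.$/u.test(emoji);

console.log(ignoreCase, lineStarts, withoutS, withS, fallback, noU, withU);
// true [ 'first', 'Second' ] false true true false true
```

In Python the same behaviors would come from re.IGNORECASE, re.MULTILINE, and re.DOTALL; the u flag has no direct counterpart because Python 3 str patterns already work on code points.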

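The example code below, first in Python and then in JavaScript, shows common regular expression operations: matching, searching, splitting, replacing, quantifiers, lookahead and lookbehind assertions, capture groups, and backreferences.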

import re

text = "The quick brown fox jumps over the lazy dog 123. Windows 2000 and XP Windows. test test."

# Match
match = re.search(r'\bfox\b', text)
if match:
    print("Match found:", match.group())  # prints 'fox'

# Search (using a quantifier and a capture group)
search_result = re.findall(r'(\b\w{4}\b)', text)
print("Search results:", search_result)  # prints ['over', 'lazy', '2000', 'test', 'test']

# Split
split_result = re.split(r'\s', text)
print("Split results:", split_result)

# Replace (using backreferences to the capture groups)
replace_result = re.sub(r'(\w+) (\w+)', r'\2 \1', text)
print("Replace results:", replace_result)

# Lookahead and lookbehind assertions
lookahead = re.search(r'Windows(?= 2000)', text)
if lookahead:
    print("Lookahead found:", lookahead.group())  # prints 'Windows'

lookbehind = re.search(r'(?<=XP )Windows', text)
if lookbehind:
    print("Lookbehind found:", lookbehind.group())  # prints 'Windows'

# Backreference inside the pattern
repeat_word = re.search(r'(\b\w+\b) \1', text)
if repeat_word:
    print("Repeat word found:", repeat_word.group())  # prints 'test test'

# Using capture groups
grouped = re.search(r'(\b\w+\b) over the (\b\w+\b)', text)
if grouped:
    print("Words over:", grouped.groups())  # prints ('jumps', 'lazy')

let text = "The quick brown fox jumps over the lazy dog 123. Windows 2000 and XP Windows. test test.";

// Match
let match = text.match(/fox/);
if (match) {
    console.log("Match found:", match[0]);  // prints 'fox'
}

// Search (using a quantifier)
let searchResult = text.match(/\b\w{4}\b/g);
console.log("Search results:", searchResult);  // prints ['over', 'lazy', '2000', 'test', 'test']

// Split
let splitResult = text.split(/\s/);
console.log("Split results:", splitResult);

// Replace (using references to the capture groups)
let replaceResult = text.replace(/(\w+) (\w+)/g, '$2 $1');
console.log("Replace results:", replaceResult);

// Lookahead and lookbehind assertions
let lookahead = text.match(/Windows(?= 2000)/);
if (lookahead) {
    console.log("Lookahead found:", lookahead[0]);  // prints 'Windows'
}

let lookbehind = text.match(/(?<=XP )Windows/);
if (lookbehind) {
    console.log("Lookbehind found:", lookbehind[0]);  // prints 'Windows'
}

// Backreference inside the pattern
let repeatWord = text.match(/(\b\w+\b) \1/);
if (repeatWord) {
    console.log("Repeat word found:", repeatWord[0]);  // prints 'test test'
}

// Using capture groups
let grouped = text.match(/(\b\w+\b) over the (\b\w+\b)/);
if (grouped) {
    console.log("Words over:", grouped[1], grouped[2]);  // prints 'jumps', 'lazy'
}
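
The table lists string.matchAll as a JavaScript counterpart of re.findall, but the examples above do not exercise it. A minimal sketch, assuming a made-up sample string and illustrative variable names, of how matchAll exposes the capture groups and position of every match:

```javascript
// Illustrative input (an assumption for this sketch).
const sample = "jumps over the lazy dog, runs over the tall fence";

// matchAll requires the g flag and yields one match object per hit,
// each with its capture groups and start index.
const pairs = [];
for (const m of sample.matchAll(/(\w+) over the (\w+)/g)) {
    pairs.push([m[1], m[2], m.index]);
}

console.log(pairs);
// [ [ 'jumps', 'lazy', 0 ], [ 'runs', 'tall', 25 ] ]
```

Unlike string.match with the g flag, which returns only the full matched strings, this keeps the groups for each match, which is closer to what re.findall or re.finditer return in Python.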

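2 Regular expression instance methods (JavaScript only)

2.1. exec()

Description: Searches a string for a match and returns a result array or null. If the regular expression has the global flag (g), each call to exec() starts searching for the next match at the position given by the regex's lastIndex property.
Return value: An array whose element 0 is the full matched string and whose remaining elements are the capture groups (if any). Returns null if no match is found.
Example: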

const regex = /(\w+)\s?/g;  // \s? makes the trailing whitespace optional, so "world" also matches
const text = "hello world";
let match;

while ((match = regex.exec(text)) !== null) {
    console.log(`Found ${match[0]}, next starts at ${regex.lastIndex}.`);
    // Output: Found hello , next starts at 6.
    //         Found world, next starts at 11.
}

2.2. test()

Description: Tests whether a string matches the regular expression's pattern.
Return value: true if a match is found, otherwise false.
Example:

const regex = /hello/;
const text = "hello world";
const result = regex.test(text);  // returns true
console.log(result);
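
One behavior worth keeping in mind follows from the lastIndex rule described for exec(): test() on a regex with the g flag also advances lastIndex, so repeated calls on the same string can alternate between true and false. A small sketch with illustrative variable names:

```javascript
const globalRegex = /hello/g;
const input = "hello world";

// First call matches at index 0 and moves lastIndex past the match.
const first = globalRegex.test(input);
const indexAfterFirst = globalRegex.lastIndex;

// Second call starts searching at index 5, finds nothing,
// and resets lastIndex to 0.
const second = globalRegex.test(input);
const indexAfterSecond = globalRegex.lastIndex;

console.log(first, indexAfterFirst);   // true 5
console.log(second, indexAfterSecond); // false 0
```

Dropping the g flag, or resetting lastIndex to 0 before each call, avoids the surprise when only a yes/no answer is needed.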

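2.3. compile()

Description: Recompiles the regular expression. The method is deprecated, so avoid it; creating a new regular expression instance is more appropriate and safer.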
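
As a sketch of the recommended alternative, a new instance can be built from an existing regex's source instead of recompiling it in place. The helper name withFlags is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper: returns a fresh RegExp with the same pattern
// and different flags, leaving the original object untouched.
function withFlags(regex, flags) {
    return new RegExp(regex.source, flags);
}

const original = /hello/;
const caseInsensitive = withFlags(original, "gi");

console.log(original.flags);                    // ''
console.log(caseInsensitive.flags);             // 'gi'
console.log(caseInsensitive.test("Say HELLO")); // true
```

Because the original object is never mutated, code that still holds a reference to it keeps seeing the pattern and flags it was created with.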

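3 Regular expression properties (JavaScript only)

3.1 Instance properties

Instance properties are bound to a regular expression instance. They provide information about that particular RegExp object, and each instance has its own independent values. Common instance properties include:

source:
- Description: The source text of the regular expression, as a string.
- Use: Lets you see the exact pattern used when the regular expression was created.
flags:
- Description: The flags the regular expression uses (such as g, i, m).
- Use: A quick way to see which flags are set on the RegExp object.
lastIndex:
- Description: The character position at which the next match starts; only used when the regular expression has the global flag g or the sticky flag y.
- Use: Controlling or querying the start position of the next match when matching repeatedly.
global, ignoreCase, multiline, dotAll, unicode, sticky:
- Description: Boolean properties that reflect whether the corresponding flag is set on the regular expression.
- Use: A quick way to inspect the details of a regular expression's behavior.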

// Define a RegExp object with several flags
let regex = new RegExp('foo', 'gim');

// Instance properties
console.log("Source:", regex.source);         // Output: foo
console.log("Flags:", regex.flags);           // Output: gim
console.log("Global:", regex.global);         // Output: true
console.log("Ignore Case:", regex.ignoreCase); // Output: true
console.log("Multiline:", regex.multiline);   // Output: true

// Match with the regular expression
let text = "Foo bar foo";
let match;
while ((match = regex.exec(text)) !== null) {
    console.log(`Found '${match[0]}' at index ${match.index}`);
    console.log("LastIndex after match:", regex.lastIndex); // shows lastIndex after the match
}
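
The lastIndex description also mentions the sticky flag y, which the example above does not use. A minimal sketch, with illustrative variable names, of how y anchors a match exactly at lastIndex:

```javascript
const sticky = /foo/y;
const str = "foo bar foo";

// With y, the match must begin exactly at lastIndex.
sticky.lastIndex = 8;
const hit = sticky.exec(str);
const indexAfterHit = sticky.lastIndex;

// At index 4 the string continues with "bar", so the match fails
// and lastIndex is reset to 0.
sticky.lastIndex = 4;
const miss = sticky.exec(str);
const indexAfterMiss = sticky.lastIndex;

console.log(hit[0], hit.index, indexAfterHit); // foo 8 11
console.log(miss, indexAfterMiss);             // null 0
```

A regex with only the g flag would instead scan forward from lastIndex until it found a match.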

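3.2 Static properties

Static properties are not tied to a particular RegExp object; they belong to the RegExp constructor itself. They mainly store global information about the most recent regular expression operation. Their values are updated after regular expression operations and are shared across different matching and searching operations. MDN documents them as deprecated legacy features. Common static properties include:

RegExp.input ($_):
- Description: The complete string against which the most recent match was performed.
- Use: Quickly inspecting or reprocessing the string used in the last match.
RegExp.lastMatch ($&):
- Description: The entire text of the most recent successful match.
- Use: Referring to the result of the last match.
RegExp.lastParen ($+):
- Description: The last capture group of the most recent match.
- Use: Useful when the last capture group needs to be accessed dynamically.
RegExp.leftContext ($`) and RegExp.rightContext ($'):
- Description: The parts of the string before and after the most recent match, respectively.
- Use: Accessing the context around the match.
RegExp.$1 to RegExp.$9:
- Description: The first through ninth capture groups of the most recent match.
- Use: Quick access to a specific capture group of the most recent match.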

let text = "Example text with 'term' and another 'term'.";
let regex = /'term'/g; // global search for 'term'

// Run the match twice
regex.exec(text);
regex.exec(text);

// Static properties
console.log("Last Match:", RegExp.lastMatch); // Output: 'term'
console.log("Last Paren:", RegExp.lastParen); // Output: '' (empty string, no capture groups)
console.log("Left Context:", RegExp.leftContext); // Output: Example text with 'term' and another 
console.log("Right Context:", RegExp.rightContext); // Output: .

// Check the static properties after multiple matches
console.log("Input:", RegExp.input); // Output: Example text with 'term' and another 'term'.
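
The same information can usually be read from the match object itself, without relying on shared state on the RegExp constructor. A sketch under that assumption; the variable names are illustrative, and a capture group is added to the pattern so a group value can be shown:

```javascript
const source = "Example text with 'term' and another 'term'.";
const m = /'(term)'/.exec(source);

// Equivalents of the static properties, derived from the match object.
const lastMatch = m[0];                                    // like RegExp.lastMatch
const firstGroup = m[1];                                   // like RegExp.$1
const leftContext = source.slice(0, m.index);              // like RegExp.leftContext
const rightContext = source.slice(m.index + m[0].length);  // like RegExp.rightContext
const inputString = m.input;                               // like RegExp.input

console.log(lastMatch);    // 'term'
console.log(firstGroup);   // term
console.log(leftContext);  // Example text with 
console.log(rightContext); //  and another 'term'.
```

Since the regex here has no g flag, exec() reports the first occurrence, so the left and right context differ from the two-call example above.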