目录 1. 不用正则表达式来查找文本模式 2. 用正则表达式来查找文本模式 2.1 创建正则表达式(Regex)对象 2.2 匹配Regex对象 3. 用正则表达式匹配更多模式 3.1 利用括号分组 3.2 用管道匹配多个分组 3.3 用问号实现可选匹配 3.4 用星号匹配零次或多次 3.5 用加号匹配一次或多次 3.6 用花括号匹配特定次数 4. 贪心和非贪心匹配 5. findall() 方法 6. 字符分类 7. 建立自己的字符分类 8. 插入字符和美元字符 9. 通配字符 9.1 用点-星匹配所有字符 9.2 用句点字符匹配换行符 10. 不区分大小写的匹配 11. 用 sub() 方法替换字符串
1. 不用正则表达式来查找文本模式
def isPhoneNumber ( text) : if len ( text) != 11 : return False for i in range ( 0 , len ( text) ) : if ( i == 3 or i == 7 ) and text[ i] != "-" : return False elif i != 3 and i != 7 and not text[ i] . isdecimal( ) : return False return True text = "123-456-789"
print ( text)
print ( isPhoneNumber( text) )
2. 用正则表达式来查找文本模式
2.1 创建正则表达式(Regex)对象
import retext = re. compile ( r'\d\d\d-\d\d\d-\d\d\d' )
2.2 匹配Regex对象
import retext = re. compile ( r'\d\d\d-\d\d\d-\d\d\d' )
match = text. search( "~~~123-456-789~~~" )
print ( match . group( ) )
3. 用正则表达式匹配更多模式
3.1 利用括号分组
import retext = re. compile ( r'(\d\d\d)-(\d\d\d-\d\d\d)' )
match = text. search( "~~~123-456-789~~~" )
print ( match . group( 1 ) )
print ( match . group( 2 ) )
print ( match . groups( ) )
3.2 用管道匹配多个分组
import retext = re. compile ( r'456|123' )
match = text. search( "123-456-789" )
print ( match . group( ) )
3.3 用问号实现可选匹配
import retext = re. compile ( r'\d\d\d(~)?\d\d\d' )
match = text. search( "123123" )
print ( match . group( ) )
match = text. search( "123~123" )
print ( match . group( ) )
3.4 用星号匹配零次或多次
import retext = re. compile ( r'\d\d\d(~)*\d\d\d' )
match = text. search( "123123" )
print ( match . group( ) )
match = text. search( "123~~~123" )
print ( match . group( ) )
3.5 用加号匹配一次或多次
import retext = re. compile ( r'\d\d\d(~)+\d\d\d' )
match = text. search( "123~123" )
print ( match . group( ) )
match = text. search( "123~~~123" )
print ( match . group( ) )
3.6 用花括号匹配特定次数
import retext = re. compile ( r'\d\d\d(~){3,5}\d\d\d' )
match = text. search( "123~~~123" )
print ( match . group( ) )
match = text. search( "123~~~~~123" )
print ( match . group( ) )
4. 贪心和非贪心匹配
贪心匹配:尽可能匹配最长的字符串 非贪心匹配:尽可能匹配最短的字符串
import retext = re. compile ( r'(123 ){2,4}' )
match = text. search( "123 123 123 123 123 " )
print ( match . group( ) )
text = re. compile ( r'(123 ){2,4}?' )
match = text. search( "123 123 123 123 123 " )
print ( match . group( ) )
5. findall() 方法
import retext = re. compile ( r'\d\d\d-\d\d\d-\d\d\d' )
match = text. search( "~~~123-456-789~~~111-222-333~~~" )
print ( match . group( ) )
match = text. findall( "~~~123-456-789~~~111-222-333~~~" )
print ( match )
6. 字符分类
编写字符分类 表示 \d 0~9的任何数字 \D 除0~9的数字以外的任何字符 \w 任何字母、数字和下划线字符 \W 除字母、数字和下划线以外的任何字符 \s 空格、制表符或换行符 \S 除空格、制表符和换行符以外的任何字符
7. 建立自己的字符分类
import retext = re. compile ( r'[0-5]' )
match = text. findall( "1a2b3c4d" )
print ( match )
text = re. compile ( r'[abc]' )
match = text. findall( "1a2b3c4d" )
print ( match )
text = re. compile ( r'[^abc]' )
match = text. findall( "1a2b3c4d" )
print ( match )
8. 插入字符和美元字符
import retext = re. compile ( r'^\d\d\d' )
match = text. search( "123abc456" )
print ( match )
text = re. compile ( r'\d\d\d$' )
match = text. search( "123abc456" )
print ( match )
9. 通配字符
import retext = re. compile ( r'..23' )
match = text. findall( "123abc23" )
print ( match )
9.1 用点-星匹配所有字符
import retext = re. compile ( r'123(.*)456(.*)' )
match = text. findall( "123abc456def" )
print ( match )
9.2 用句点字符匹配换行符
re.DOTALL :让句点字符匹配所有字符(包括换行符)
import retext = re. compile ( r'.*' )
match = text. search( "123abc\n456def" )
print ( match . group( ) )
text = re. compile ( r'.*' , re. DOTALL)
match = text. search( "123abc\n456def" )
print ( match . group( ) )
10. 不区分大小写的匹配
import retext = re. compile ( r'abc' , re. I)
match = text. findall( "abcABC" )
print ( match )
11. 用 sub() 方法替换字符串
import retext = re. compile ( r'ABC\w*' )
match = text. sub( "abc" , "ABC : 123" )
print ( match )