[Newbie needs help] How to use a regular expression to extract the substring after string A and before string B (if B exists)

2022-01-07 10:14:39 +08:00
 outman87
Caused by: errorCode=QUERYILLEGAL0001 errorMsg= codeMsg=QRYILLEGAL0096
Caused by: java.lang.ArrayIndexOutOfBoundsException
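For the two strings above, I want to extract (rather than merely match) the content after "Caused by: ", stopping when errorMsg is reached (if it is present). How should the regular expression be written?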

import re

str1 = 'Caused by: errorCode=QUERYILLEGAL0001 errorMsg= codeMsg=QRYILLEGAL0096'
str2 = "Caused by: java.lang.ArrayIndexOutOfBoundsException"

# The three variants applied to str1 (errorMsg present)
res1 = re.findall(r'Caused by: ((?!errorMsg=).)*', str1)
res2 = re.findall(r'Caused by: (?:(?!errorMsg=).)*', str1)
res3 = re.findall(r'Caused by: ((?:(?!errorMsg=).)*)', str1)
print(res1)
print(res2)
print(res3)

# The same three variants applied to str2 (errorMsg absent)
res4 = re.findall(r'Caused by: ((?!errorMsg=).)*', str2)
res5 = re.findall(r'Caused by: (?:(?!errorMsg=).)*', str2)
res6 = re.findall(r'Caused by: ((?:(?!errorMsg=).)*)', str2)
print(res4)
print(res5)
print(res6)

=================== RESTART: C:\Users\anonymous\Desktop\test.py ===================
[' ']
['Caused by: errorCode=QUERYILLEGAL0001 ']
['errorCode=QUERYILLEGAL0001 ']
['n']
['Caused by: java.lang.ArrayIndexOutOfBoundsException']
['java.lang.ArrayIndexOutOfBoundsException']

The three variants:
re.findall(r'Caused by: ((?!errorMsg=).)*', str)
re.findall(r'Caused by: (?:(?!errorMsg=).)*', str)
re.findall(r'Caused by: ((?:(?!errorMsg=).)*)', str)

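Only the odd-looking third one does what I need. Why doesn't the first one give the expected result? Is there a more "elegant", more concise way to write this?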
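On the first question, a short sketch of what appears to be going on: when a capturing group is repeated with `*`, Python's `re` keeps only the text from the group's last iteration, and `re.findall` returns group contents whenever the pattern contains a group. That would explain the single-character results for the first variant. The names `PATTERN_LAZY` and `extract` below are made up for illustration, and the lazy pattern is just one possible shorter alternative, assuming single-line input.

```python
import re

str1 = 'Caused by: errorCode=QUERYILLEGAL0001 errorMsg= codeMsg=QRYILLEGAL0096'
str2 = 'Caused by: java.lang.ArrayIndexOutOfBoundsException'

# A repeated capturing group only remembers its final iteration,
# so group 1 ends up holding the last single character it consumed.
m = re.search(r'Caused by: ((?!errorMsg=).)*', str1)
last_char = m.group(1)   # last character matched by the repeated group
whole = m.group(0)       # full match, still including 'Caused by: '

# Hypothetical alternative (not from the thread): match lazily and stop
# right before 'errorMsg=' or, if it never appears, at the end of the string.
PATTERN_LAZY = re.compile(r'Caused by: (.*?)(?=errorMsg=|$)')

def extract(text):
    """Return the text after 'Caused by: ', cut before 'errorMsg=' if present."""
    match = PATTERN_LAZY.search(text)
    return match.group(1) if match else None

print(repr(last_char))
print(repr(extract(str1)))
print(repr(extract(str2)))
```

Like the third variant, this keeps the trailing space before errorMsg; calling `.strip()` on the result would remove it.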

Thanks!!
1377 views
Node: Python
3 replies
b1iy
2022-01-07 10:26:01 +08:00
A blind guess:
regex = r"(?<=Caused\sby:)(\N+(?=errorMsg)|\N+)"
b1iy
2022-01-07 10:27:05 +08:00
Copied the wrong one.

regex = r"(?<=Caused\sby:)(.+(?=errorMsg)|.+)"
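A quick check of what this pattern returns on the two sample strings from the question. Because the lookbehind ends right after the colon, the captured text starts with a space, so stripping whitespace is probably wanted. The helper name `after_caused_by` is hypothetical, and the sketch assumes each input is a single line.

```python
import re

regex = r"(?<=Caused\sby:)(.+(?=errorMsg)|.+)"

samples = [
    'Caused by: errorCode=QUERYILLEGAL0001 errorMsg= codeMsg=QRYILLEGAL0096',
    'Caused by: java.lang.ArrayIndexOutOfBoundsException',
]

def after_caused_by(text):
    """Apply the pattern above and trim surrounding whitespace from the capture."""
    m = re.search(regex, text)
    return m.group(1).strip() if m else None

# Raw captures keep the space that follows 'Caused by:'.
raw = [re.search(regex, s).group(1) for s in samples]
# Stripped captures are the values the question asks for.
results = [after_caused_by(s) for s in samples]

print(raw)
print(results)
```

Note that `.+` is greedy, so if errorMsg occurred more than once in a line, the first alternative would stop before the last occurrence rather than the first.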
outman87
2022-01-07 10:57:56 +08:00
@b1iy That works! Thank you. I have barely used lookahead before, so I'll work through it on my own.

https://www.v2ex.com/t/826742
