
Python: UserWarning: This pattern has match groups. To actually get the groups, use str.extract

Views: 13 | Date: 2022-08-07 13:30:45

How do I resolve Python: UserWarning: This pattern has match groups. To actually get the groups, use str.extract?

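At least one of the regex patterns in urls must use a capturing group. str.contains only returns True or False for each row in df['event_time']; it does not make use of the capturing group. The UserWarning is therefore alerting you that the regex contains a capturing group whose match is never used.

If you want to get rid of the UserWarning, find and remove the capturing group from the regex pattern(s). Capturing groups do not appear in the regex patterns you posted, but they must be present in your actual file. Look for parentheses outside of the character classes.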
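One way to locate the offending patterns is to compile each one and check how many capturing groups it reports. The following is a minimal sketch using only the standard library; the helper name patterns_with_groups and the third sample pattern are hypothetical, added for illustration.

```python
import re


def patterns_with_groups(patterns):
    """Return (pattern, group_count) pairs for patterns that contain capturing groups."""
    flagged = []
    for pattern in patterns:
        # Pattern.groups is the number of capturing groups in the compiled regex.
        n_groups = re.compile(pattern).groups
        if n_groups > 0:
            flagged.append((pattern, n_groups))
    return flagged


patterns = [
    # Parentheses inside a character class are literals, not a group.
    "003.ru/[a-zA-Z0-9-_%$#?.:+=|()]+/mobilnyj_telefon_bq_phoenix",
    "003.ru/sonyxperia",
    # Hypothetical entry with a real capturing group.
    "g(.*)",
]

print(patterns_with_groups(patterns))
```

Running the same check over the url column of the real file should point to the rows that trigger the warning.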

Alternatively, you can suppress this particular UserWarning by putting

    import warnings
    warnings.filterwarnings('ignore', 'This pattern has match groups')

before the call to str.contains.
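If a global filter feels too broad, the suppression can be limited to a single call with warnings.catch_warnings. This is a sketch rather than the answer's own method. It assumes that the wording of the message may differ between pandas versions, so the filter pattern below is deliberately loose; the sample Series is hypothetical.

```python
import warnings

import pandas as pd

events = pd.Series(["gouda", "stilton", "gruyere"])

with warnings.catch_warnings():
    # The message argument is a regex matched against the start of the warning text.
    # The loose ".*" is an assumption to tolerate wording changes across pandas versions.
    warnings.filterwarnings(
        "ignore",
        message=r"This pattern .*has match groups",
        category=UserWarning,
    )
    mask = events.str.contains("g(.*)", regex=True)

# Outside the with-block the previous warning filters are restored.
print(mask.tolist())  # [True, False, True]
```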

Here is a simple example that demonstrates the problem (and the solution):

    # import warnings
    # warnings.filterwarnings('ignore', 'This pattern has match groups')  # uncomment to suppress the UserWarning

    import pandas as pd

    df = pd.DataFrame({'event_time': ['gouda', 'stilton', 'gruyere']})

    urls = pd.DataFrame({'url': ['g(.*)']})  # With a capturing group, there is a UserWarning
    # urls = pd.DataFrame({'url': ['g.*']})  # Without a capturing group, there is no UserWarning. Uncommenting this line avoids the UserWarning.

    substr = urls.url.values.tolist()
    df[df['event_time'].str.contains('|'.join(substr), regex=True)]

prints

    script.py:10: UserWarning: This pattern has match groups. To actually get the groups, use str.extract.
      df[df['event_time'].str.contains('|'.join(substr), regex=True)]

Removing the capturing group from the regex pattern:

    urls = pd.DataFrame({'url': ['g.*']})

avoids the UserWarning.
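When the parentheses are needed for grouping (for alternation, say), a non-capturing group (?:...) is one alternative to deleting them, and str.extract is the call to use when the captured text is actually wanted. The sketch below is not part of the original answer; the pattern g(?:ou|ru) is a hypothetical illustration.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"event_time": ["gouda", "stilton", "gruyere"]})

# Record every warning raised during the call so we can inspect them afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # (?:...) groups the alternation without creating a capturing group.
    mask = df["event_time"].str.contains("g(?:ou|ru)", regex=True)

group_warnings = [w for w in caught if "match groups" in str(w.message)]
print(len(group_warnings))
print(df[mask])

# str.extract returns the captured text instead of a boolean;
# rows that do not match get a missing value.
captured = df["event_time"].str.extract("g(.*)", expand=False)
print(captured)
```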

Original question

I have a dataframe, and I am trying to select the rows where a column contains certain strings. The df looks like

    member_id,event_path,event_time,event_duration
    30595,'2016-03-30 12:27:33',yandex.ru/,1
    30595,'2016-03-30 12:31:42',0
    30595,'2016-03-30 12:31:43',yandex.ru/search/?lr=10738&msid=22901.25826.1459330364.89548&text=%D1%84%D0%B8%D0%BB%D1%8C%D0%BC%D1%8B+%D0%BE%D0%BD%D0%BB%D0%B0%D0%B9%D0%BD&suggest_reqid=168542624144922467267026838391360&csg=3381%2C3938%2C2%2C3%2C1%2C0%2C0,'2016-03-30 12:31:44','2016-03-30 12:31:45','2016-03-30 12:31:46','2016-03-30 12:31:49',kinogo.co/,'2016-03-30 12:32:11',kinogo.co/melodramy/,0

and another df with URLs

    url
    003.ru/[a-zA-Z0-9-_%$#?.:+=|()]+/mobilnyj_telefon_bq_phoenix
    003.ru/[a-zA-Z0-9-_%$#?.:+=|()]+/mobilnyj_telefon_fly_
    003.ru/sonyxperia
    003.ru/[a-zA-Z0-9-_%$#?.:+=|()]+/mobilnye_telefony_smartfony
    003.ru/[a-zA-Z0-9-_%$#?.:+=|()]+/mobilnye_telefony_smartfony/brands5D5Bbr_23
    1click.ru/sonyxperia
    1click.ru/[a-zA-Z0-9-_%$#?.:+=|()]+/chasy-motorola

I use

    import pandas as pd

    # error_bad_lines is the option from older pandas releases; newer releases use on_bad_lines='skip' instead.
    urls = pd.read_csv('relevant_url1.csv', error_bad_lines=False)
    substr = urls.url.values.tolist()
    data = pd.read_csv('data_nts2.csv', error_bad_lines=False, chunksize=50000)
    result = pd.DataFrame()
    for i, df in enumerate(data):
        res = df[df['event_time'].str.contains('|'.join(substr), regex=True)]

but it gives me

    UserWarning: This pattern has match groups. To actually get the groups, use str.extract.

How can I fix this?
