Posted by Kev*_*eel

How to remove hashtags, @user mentions, and links using regular expressions

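I need to preprocess tweets using Python. What regular expression (or set of expressions) would remove all hashtags, @user mentions, and links from a tweet?

For example: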

  1. Original tweet: @peter I really love that shirt at #Macy. http://bet.ly//WjdiW4
    • Processed tweet: I really love that shirt at Macy

  2. Original tweet: @shawn Titanic tragedy could have been prevented Economic Times: Telegraph.co.ukTitanic tragedy could have been preve... http://bet.ly/tuN2wx
    • Processed tweet: Titanic tragedy could have been prevented Economic Times Telegraph co ukTitanic tragedy could have been preve

  3. Original tweet: I am at Starbucks http://4sh.com/samqUI (7419 3rd ave, at 75th, Brooklyn)
    • Processed tweet: I am at Starbucks 7419 3rd ave at 75th Brooklyn

I only need the meaningful words in each tweet. I don't need usernames, links, or any punctuation.
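
To make the expected behavior concrete, here is a minimal sketch of one way the three examples above could be produced with Python's `re` module. It is not taken from any answer; the function name `clean_tweet` and the three pattern names are hypothetical. It assumes ASCII-only text, treats anything of the form `scheme://...` as a link, and keeps the word after `#` while dropping the `#` itself, since that is what the "Macy" example shows.

```python
import re

# Hypothetical helper patterns (names are illustrative, not from the question).
URL_RE = re.compile(r"\w+://\S+")           # scheme://... up to the next whitespace
MENTION_RE = re.compile(r"@\w+")            # @username
NON_WORD_RE = re.compile(r"[^0-9A-Za-z\s]") # any remaining punctuation, including '#'


def clean_tweet(text: str) -> str:
    """Strip links, @mentions, and punctuation, then collapse whitespace.

    Assumption: ASCII-only tweets. Non-ASCII letters would also be removed
    by NON_WORD_RE.
    """
    # Links go first, so their dots and slashes are not left behind as
    # stray fragments once punctuation is stripped.
    text = URL_RE.sub(" ", text)
    # Mentions go before punctuation stripping, while '@' is still present.
    text = MENTION_RE.sub(" ", text)
    text = NON_WORD_RE.sub(" ", text)
    # split() with no arguments drops the runs of spaces left by the substitutions.
    return " ".join(text.split())


if __name__ == "__main__":
    print(clean_tweet("@peter I really love that shirt at #Macy. http://bet.ly//WjdiW4"))
```

The order of the substitutions matters: if punctuation were stripped first, the `@` and `://` markers would be gone and the username and URL fragments would survive as ordinary words. Note that `MENTION_RE` would also eat the domain part of an email address, which may or may not be acceptable for a given dataset.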

python regex twitter

12 votes, 2 answers, 30k views
