How do I remove Google tracking parameters (UTM) from a URL?

myh*_*yhd 5 ruby google-analytics

I have a bunch of URLs that I want to clean up. They all contain UTM parameters, which are unnecessary or, in this case, harmful. Example:

http://houseofbuttons.tumblr.com/post/22326009438?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+HouseOfButtons+%28House+of+Buttons%29

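All of the parameters in question start with utm_. How can I easily remove them with a Ruby script/construct without destroying other, potentially "good", URL parameters?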

ste*_*lag 11

This uses the URI lib to deconstruct and change the query string (no regular expressions):

require 'uri'

str = 'http://houseofbuttons.tumblr.com/post/22326009438?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+HouseOfButtons+%28House+of+Buttons%29&normal_param=1'

uri = URI.parse(str)
# Split the query into [key, value] pairs and drop every key that starts with "utm_".
clean_key_vals = URI.decode_www_form(uri.query).reject { |k, _| k.start_with?('utm_') }
# Re-encode the remaining pairs and write them back into the URI.
uri.query = URI.encode_www_form(clean_key_vals)
p uri.to_s #=> "http://houseofbuttons.tumblr.com/post/22326009438?normal_param=1"
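The snippet above assumes the URL has a query string and that at least one parameter survives. A minimal sketch of the same URI-based approach wrapped in a method, covering those two edge cases, could look like this. The method name `strip_utm_params` and the example.com URLs are illustrative assumptions, not part of the original answer.

```ruby
require 'uri'

# Hypothetical helper (the name is an assumption): returns the URL with every
# query parameter whose name starts with "utm_" removed.
def strip_utm_params(url)
  uri = URI.parse(url)
  return url if uri.query.nil? # no query string, nothing to clean

  kept = URI.decode_www_form(uri.query).reject { |key, _| key.start_with?('utm_') }
  # Set the query to nil when nothing is left, so no dangling "?" remains.
  uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
  uri.to_s
end

puts strip_utm_params('http://example.com/post?utm_source=feedburner&id=7&utm_medium=feed')
# prints http://example.com/post?id=7
puts strip_utm_params('http://example.com/post?utm_source=feedburner')
# prints http://example.com/post
```

Note that decoding and re-encoding the query can normalize how the remaining parameters are escaped (for example, a space written as `%20` comes back as `+`).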

  • +1 for using the tool designed for this job. (2 upvotes)

Boz*_*sov 8

You can apply a regular expression to the URL to clean it up. Something like this should do the trick:

url = 'http://houseofbuttons.tumblr.com/post/22326009438?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+HouseOfButtons+%28House+of+Buttons%29&normal_param=1'
url.gsub(/&?utm_.+?(&|$)/, '') # => "http://houseofbuttons.tumblr.com/post/22326009438?normal_param=1"
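One caveat with that pattern: it consumes the `&` on both sides of a match, so a utm_ parameter sitting between two other parameters (`?a=1&utm_source=x&b=2`) leaves them glued together as `?a=1b=2`. A hedged sketch of a stricter regex follows. The constant `UTM_PARAM` and the method `strip_utm_with_regex` are hypothetical names, and this is only one possible pattern, not the answer author's.

```ruby
# Hypothetical pattern: match "utm_..." only when it directly follows "?" or "&"
# (lookbehind, so the separator in front is kept), up to the next "&" or "#",
# plus the single "&" that follows the pair, if any.
UTM_PARAM = /(?<=[?&])utm_[^&#]*&?/

def strip_utm_with_regex(url)
  url.gsub(UTM_PARAM, '')         # drop each utm_ pair and the "&" after it
     .sub(/[?&](?=#|\z)/, '')     # drop a "?" or "&" left dangling before "#" or the end
end

puts strip_utm_with_regex('http://example.com/post?a=1&utm_source=x&b=2')
# prints http://example.com/post?a=1&b=2
puts strip_utm_with_regex('http://example.com/post?utm_source=x&utm_medium=y')
# prints http://example.com/post
```

Because the lookbehind requires `?` or `&` in front, a value such as `q=utm_test` is left alone. The pattern still works on the raw string, so unusual inputs (for instance, a fragment that itself contains `&utm_`) are probably better served by the URI-based approach.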