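I'm somewhat new to regular expressions in Ruby (or, I suppose, to regular expressions in general), but I'd like to know whether there is a practical way to match a string using an array.
Let me explain. Say I have a list of ingredients like this: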
1 1/3 cups all-purpose flour
2 teaspoons ground cinnamon
8 ounces shredded mozzarella cheese
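Ultimately I need to split each ingredient into its "quantity and measurement" and its "ingredient name", so that, for example, 8 ounces shredded mozzarella cheese is split into "8 ounces" and "shredded mozzarella cheese".
So, instead of having a very long regex like (cup\w*|teaspoon\w*|ounce\w* ....... ), how can I use an array to hold those values outside the regex?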
Update
I did it this way (thanks to cwninja):
# I think all the units should be singular only, then
# use a Ruby function to pluralize them.
units = [
'tablespoon',
'teaspoon',
'cup',
'can',
'quart',
'gallon',
'pinch',
'pound',
'pint',
'fluid ounce',
'ounce'
# ... shortened for brevity
]
# String#pluralize comes from ActiveSupport (loaded automatically in Rails).
joined_units = (units.collect{|u| u.pluralize} + units).join('|')
# There are actually many ingredients, so in practice this runs inside an iterator,
# but for the sake of example we show just one.
ingredient = "1 (10 ounce) can diced tomatoes and green chilies, undrained"
arr = ingredient.split(/([\d\/\.\s]+(\([^)]+\))?)\s(#{joined_units})?\s?(.*)/i)
This gets me close to what I want, so I think this is the direction I want to go.
puts "measurement: #{arr[1]}"
puts "unit: #{arr[-2] if arr.size > 3}"
puts "title: #{arr[-1].strip}"
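For comparison, here is a minimal sketch of the same idea using `match` with named captures instead of `split`, so each part can be read by name rather than by array position. It assumes plain Ruby without ActiveSupport: the `(?:es|s)?` suffix is only a crude stand-in for `pluralize`, and the names `UNITS`, `UNIT_PATTERN`, `INGREDIENT` and `parse_ingredient` are hypothetical, not part of the original code.

```ruby
# Shortened unit list, as in the question. 'fluid ounce' is listed before
# 'ounce' so the longer alternative is tried first.
UNITS = ['tablespoon', 'teaspoon', 'cup', 'can', 'fluid ounce', 'ounce', 'pinch'].freeze

# Regexp.union escapes each string; .source gives the bare pattern text so it
# can be embedded in a larger regex. (?:es|s)? is a rough plural suffix
# (assumption: good enough for these units, not a general pluralizer).
UNIT_PATTERN = /(?:#{Regexp.union(UNITS).source})(?:es|s)?/i

# quantity: digits, slashes, dots and spaces, plus an optional "(...)" note
# unit:     optional, must be followed by whitespace
# name:     everything that is left
INGREDIENT = /\A(?<quantity>[\d\/.\s]+(?:\([^)]+\))?)\s+(?:(?<unit>#{UNIT_PATTERN.source})\s+)?(?<name>.*)\z/i

def parse_ingredient(line)
  m = INGREDIENT.match(line.strip)
  return nil unless m

  { quantity: m[:quantity].strip, unit: m[:unit], name: m[:name].strip }
end

p parse_ingredient("1 (10 ounce) can diced tomatoes and green chilies, undrained")
p parse_ingredient("1 1/3 cups all-purpose flour")
```

The unit is `nil` when a line has a quantity but no recognized unit, which avoids the `arr.size > 3` check.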
cwn*_*nja 28
Personally, I'd just build the regex programmatically. You can do it like this:
ingredients = [...]
recipe = Regexp.new(ingredients.join("|"), Regexp::IGNORECASE) # Case-insensitive
Or use the union method:
union = Regexp.union(ingredients) # escapes regex metacharacters in each string
recipe = /#{union.source}/i       # interpolate .source so the /i flag applies
...and then use the recipe regex.
As long as you save it and don't keep rebuilding it, it should be fairly efficient.
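As a rough illustration of "build it once and reuse it", the sketch below stores the compiled regex in a constant and applies it to several lines. The list contents and the names `INGREDIENT_NAMES`, `INGREDIENT_RE` and `find_ingredient` are made up for the example, since the original elides the array. It goes through `.source` because interpolating a Regexp object directly embeds that object's own flags, which would switch case-insensitivity back off.

```ruby
# Hypothetical list; the original answer does not show its contents.
INGREDIENT_NAMES = ['flour', 'cinnamon', 'mozzarella cheese'].freeze

# Compiled once when the file is loaded, then reused for every line.
INGREDIENT_RE = Regexp.new(Regexp.union(INGREDIENT_NAMES).source, Regexp::IGNORECASE)

# Returns the matched ingredient text, or nil when nothing in the list matches.
def find_ingredient(line)
  line[INGREDIENT_RE]
end

lines = [
  "1 1/3 cups all-purpose flour",
  "8 ounces shredded Mozzarella Cheese",
  "2 cloves garlic"
]
lines.each { |line| p find_ingredient(line) }
```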