I have a list of strings:
In [53]: l = ['#Trending', '#Trending', '#TrendinG', '#Yax', '#YAX', '#Yax']
In [54]: set(l)
Out[54]: {'#TrendinG', '#Trending', '#YAX', '#Yax'}
I want a case-insensitive set of this list.
Expected result:
Out[55]: {'#Trending', '#Yax'}
How can I do this?
Mar*_*ers 14
If you need to preserve case, you can use a dictionary instead. Case-fold the keys, then extract the values into a set:
set({v.casefold(): v for v in l}.values())
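Note that the dictionary comprehension keeps the last spelling seen for each folded key, because later assignments overwrite earlier ones. If the first spelling should win instead (as in the expected result in the question), one possible sketch uses dict.setdefault. The helper name casefold_unique is hypothetical and not part of the answer:

```python
def casefold_unique(strings):
    """Return one spelling per case-insensitive key, keeping the first one seen."""
    first_seen = {}
    for s in strings:
        # setdefault stores s only if the folded key is not present yet,
        # so earlier spellings are never overwritten.
        first_seen.setdefault(s.casefold(), s)
    return set(first_seen.values())


l = ['#Trending', '#Trending', '#TrendinG', '#Yax', '#YAX', '#Yax']
print(casefold_unique(l))  # a set containing '#Trending' and '#Yax'
```

This assumes Python 3; on Python 2, str.lower would have to stand in for str.casefold.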
str.casefold() applies Unicode case folding, which is more aggressive than str.lower(). This matters for non-ASCII text; for example, the German ß folds to ss and the long s (ſ) folds to s:
>>> print(s := 'Waſſerſchloß', s.lower(), s.casefold(), sep=" - ")
Waſſerſchloß - waſſerſchloß - wasserschloss
You can encapsulate this into a class.
If you don't care about preserving case, just use a set comprehension:
{v.casefold() for v in l}
Note that Python 2 has no str.casefold(); use str.lower() there instead.
Demo:
>>> l = ['#Trending', '#Trending', '#TrendinG', '#Yax', '#YAX', '#Yax']
>>> set({v.casefold(): v for v in l}.values())
{'#Yax', '#TrendinG'}
>>> {v.lower() for v in l}
{'#trending', '#yax'}
Wrapping the first approach into a class looks like:
try:
    # Python 3
    from collections.abc import MutableSet
except ImportError:
    # Python 2
    from collections import MutableSet

class CasePreservingSet(MutableSet):
    def __init__(self, *values):
        # Maps each folded string to the most recently added original spelling
        self._values = {}
        try:
            self._fold = str.casefold  # Python 3
        except AttributeError:
            self._fold = str.lower  # Python 2
        for v in values:
            self.add(v)

    def __repr__(self):
        return '<{}{} at {:x}>'.format(
            type(self).__name__, tuple(self._values.values()), id(self))

    def __contains__(self, value):
        return self._fold(value) in self._values

    def __iter__(self):
        try:
            # Python 2
            return self._values.itervalues()
        except AttributeError:
            # Python 3
            return iter(self._values.values())

    def __len__(self):
        return len(self._values)

    def add(self, value):
        self._values[self._fold(value)] = value

    def discard(self, value):
        try:
            del self._values[self._fold(value)]
        except KeyError:
            pass
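A usage demo might look like the following sketch. To keep it runnable on its own, it restates the class in a condensed, Python 3-only form (an assumption; the version above also supports Python 2), then exercises membership and iteration:

```python
from collections.abc import MutableSet


class CasePreservingSet(MutableSet):
    """Condensed Python 3-only restatement of the class above."""

    def __init__(self, *values):
        self._values = {}  # folded string -> last spelling added
        for v in values:
            self.add(v)

    def __contains__(self, value):
        return value.casefold() in self._values

    def __iter__(self):
        return iter(self._values.values())

    def __len__(self):
        return len(self._values)

    def add(self, value):
        self._values[value.casefold()] = value

    def discard(self, value):
        # pop with a default is a no-op when the key is missing
        self._values.pop(value.casefold(), None)


l = ['#Trending', '#Trending', '#TrendinG', '#Yax', '#YAX', '#Yax']
cps = CasePreservingSet(*l)
print(len(cps))            # 2
print('#treNdinG' in cps)  # True
print(sorted(cps))         # ['#TrendinG', '#Yax']
```

Because the constructor takes its members as separate positional arguments, the list is unpacked with *l. The stored spelling is the last one added for each folded key, which is why '#TrendinG' appears rather than '#Trending'.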