Roughly speaking, groupby can do things like this:
from itertools import groupby
[k for k, g in groupby('AAAABBBCCDAABBB')] #--> A B C D A B
[list(g) for k, g in groupby('AAAABBBCCD')] #--> AAAA BBB CC D
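Here k is the single character shared by each run of identical consecutive characters, and g yields the characters of that entire run.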
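To see k and g side by side, here is a minimal sketch that pairs each key with its joined run. The variable name `pairs` is only for illustration.

```python
from itertools import groupby

# Pair each key with the run of characters it stands for.
pairs = [(k, ''.join(g)) for k, g in groupby('AAAABBBCCD')]
print(pairs)
# [('A', 'AAAA'), ('B', 'BBB'), ('C', 'CC'), ('D', 'D')]
```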
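For a more detailed explanation of groupby, see this article or the official documentation. The function is not limited to strings; it works on any iterable.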
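As a rough sketch of groupby on something other than a string, the example below groups a list of integers by parity using the `key` argument. The sample data is made up for illustration. Note that groupby only merges consecutive items with equal keys, so the same key can appear more than once.

```python
from itertools import groupby

numbers = [1, 3, 5, 2, 4, 7, 9, 6]  # made-up sample data

# Group consecutive numbers that share the same parity (True means even).
runs = [(is_even, list(group))
        for is_even, group in groupby(numbers, key=lambda n: n % 2 == 0)]
print(runs)
# [(False, [1, 3, 5]), (True, [2, 4]), (False, [7, 9]), (True, [6])]
```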
___________________________________________
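But this raises a new problem: g is an iterator, so len() cannot be used on it. The usual workaround is len(tuple(g)), which materializes g so that its length can be computed.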
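The following sketch illustrates both points: calling len() directly on a group fails, while materializing it with tuple() works. The names `raised` and `run_lengths` are chosen here only for illustration.

```python
from itertools import groupby

# Take the first group and confirm that it does not support len().
key, group = next(groupby('AAAABBB'))
raised = False
try:
    len(group)
except TypeError:
    raised = True  # group objects are iterators without a length

# Materialize each group as a tuple, then measure it.
run_lengths = [(k, len(tuple(g))) for k, g in groupby('AAAABBBCCD')]
print(raised)       # True
print(run_lengths)  # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
```

The cost of this approach is that every run is copied into a temporary tuple just to be counted.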
Another option is to follow the approach of more-itertools.ilen, which counts the items without building a temporary container:
from collections import deque
from itertools import count

def ilen(iterable):
    """Return the number of items in *iterable*.

        >>> ilen(x for x in range(1000000) if x % 3 == 0)
        333334

    This consumes the iterable, so handle with care.

    """
    # This approach was selected because benchmarks showed it's likely the
    # fastest of the known implementations at the time of writing.
    # See GitHub tracker: #236, #230.
    counter = count()
    deque(zip(iterable, counter), maxlen=0)
    return next(counter)
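Putting the pieces together, here is a hedged sketch that uses ilen to measure each run from groupby. It assumes the third-party more-itertools package is installed and falls back to a plain generator count otherwise. The helper name `run_lengths` is hypothetical, not part of either library.

```python
from itertools import groupby

try:
    from more_itertools import ilen  # third-party: pip install more-itertools
except ImportError:
    # Fallback for this sketch only: a simpler count with the same result.
    def ilen(iterable):
        return sum(1 for _ in iterable)

def run_lengths(text):
    """Return (character, run length) pairs for consecutive runs in text."""
    return [(k, ilen(g)) for k, g in groupby(text)]

result = run_lengths('AAAABBBCCDAABBB')
print(result)
# [('A', 4), ('B', 3), ('C', 2), ('D', 1), ('A', 2), ('B', 3)]
```

Each group is consumed as it is counted, so nothing is stored beyond the resulting pairs.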