Python数据结构

发布于 2017-07-29 · 本文总共 6419 字 · 阅读大约需要 19 分钟

tuple

元组是 immutable （不可变的），一般可包含异质元素序列，通过解包或索引访问

>>> tuple1 = (1,2)
>>> tuple2 = (2,3)
>>> tuple1 + tuple2 + (1,)
(1, 2, 2, 3, 1)

set

集合是由不重复元素组成的无序容器; 基本用法包括成员检测、消除重复元素; 集合对象支持合集、交集、差集、对称差分等数学运算;

s = set([1, 2, 3])
s.add(4)
s.remove(4)

union（联合），intersection（交），difference（差）和 sysmmetric difference（对称差集）

a = set('abcdefg')
b = set('xyz')
a - b 
a | b 
a & b 
a ^ b # in a or b but not both

>>> sub1 = {2,3}
>>> sub = {1,2}

>>> sub | sub1
set([1, 2, 3])

>>> sub & sub1
set([2])

>>> sub - sub1
set([1])

>>> sub ^ sub1
set([1, 3])

集合推导式

集合推导式语法

a = {x for x in 'abracadabra' if x not in 'abc'}

集合操作示例

集合操作示例：

递增子序列:

输入: [4, 6, 7, 7]

输出: [[4, 6], [4, 7], [4, 6, 7], [4, 6, 7, 7], [6, 7], [6, 7, 7], [7,7], [4,7,7]]

class Solution(object):
    def findSubsequences(self, nums):
        """
        :type nums: List[int]
        :rtype: List[List[int]]
        """
        subs = {()}

        for i in range(len(nums)):
            subs |= {sub + (nums[i],) for sub in subs if not sub or sub[-1] <= nums[i]}

        return [sub for sub in subs if len(sub) >= 2]

list

常用方法

list.append(x) 在列表末尾添加一个元素，相当于 a[len(a):] = [x] ；

list.extend(iterable) 用可迭代对象的元素扩展列表。相当于 a[len(a):] = iterable ；

list.insert(i, x) 在指定位置插入元素。第一个参数是插入元素的索引，因此，a.insert(0, x) 在列表开头插入元素， a.insert(len(a), x) 等同于 a.append(x)；

list.remove(x) 从列表中删除第一个值为 x 的元素。未找到指定元素时，触发 ValueError 异常；

list.pop([i]) 删除列表中指定位置的元素，并返回被删除的元素。未指定位置时，a.pop() 删除并返回列表的最后一个元素；

list.clear() 删除列表里的所有元素，相当于 del a[:] ；

list.index(x[, start[, end]]) 返回列表中第一个值为 x 的元素的零基索引。未找到指定元素时，触发 ValueError 异常；

list.count(x) 返回列表中元素 x 出现的次数；

list.sort(*, key=None, reverse=False) 就地排序列表中的元素；

list.reverse() 反转列表中的元素；

list.copy() 返回列表的浅拷贝；

列表用作堆栈

入栈：list.append(x) 添加一个元素到堆栈的顶端;

出栈：list.pop() 从堆栈顶部取出一个元素;

>>> stack = [1,2,3]
>>> stack.append(4)
>>> stack.append(5)
>>> stack
[1, 2, 3, 4, 5]
>>> 
>>> 
>>> stack.pop()
5
>>> stack
[1, 2, 3, 4]

列表作为队列使用

在列表开头插入或移除元素却很慢（因为所有其他元素都必须移动一位）;

使用collections.deque

>>> from collections import deque
>>> 
>>> queue = deque([1,2,3,4,5,6])
>>> 
>>> queue.append(100)
>>> 
>>> queue
deque([1, 2, 3, 4, 5, 6, 100])
>>> 
>>> queue.popleft()
1
>>> 
>>> queue
deque([2, 3, 4, 5, 6, 100])

列表推导式

>>> squares = [i*i for i in range(10)]
>>> 
>>> 
>>> squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> 
>>> 
>>> squires = list(map(lambda x: x*x, range(10)))
>>> 
>>> squires
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

dict

创建dict

dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
{'sape': 4139, 'jack': 4098, 'guido': 4127}

 dict(sape=4139, guido=4127, jack=4098)
{'sape': 4139, 'jack': 4098, 'guido': 4127}

字典推导式

{x: x**2 for x in (2, 4, 6)}

常见字典方法：

Method	Description
clear()	Removes all items from the dictionary.
copy()	Returns a shallow copy of the dictionary.
fromkeys(seq[, v])	Returns a new dictionary with keys from seq and value equal to v (defaults to None).
get(key[,d])	Returns the value of the key. If the key does not exist, returns d (defaults to None).
items()	Return a new object of the dictionary’s items in (key, value) format.
keys()	Returns a new object of the dictionary’s keys.
pop(key[,d])	Removes the item with the key and returns its value or d if key is not found. If d is not provided and the key is not found, it raises KeyError.
popitem()	Removes and returns an arbitrary item (key, value). Raises KeyError if the dictionary is empty.
setdefault(key[,d])	Returns the corresponding value if the key is in the dictionary. If not, inserts the key with a value of d and returns d (defaults to None).
update([other])	Updates the dictionary with the key/value pairs from other, overwriting existing keys.
values()	Returns a new object of the dictionary’s values

内置方法：

Function	Description
all()	Return True if all keys of the dictionary are True (or if the dictionary is empty).
any()	Return True if any key of the dictionary is true. If the dictionary is empty, return False.
len()	Return the length (the number of items) in the dictionary.
cmp()	Compares items of two dictionaries. (Not available in Python 3)
sorted()	Return a new sorted list of keys in the dictionary.

del

>>> l = [0,1,2,3,4,5,6,7]
>>> del l[0]
>>> l
[1, 2, 3, 4, 5, 6, 7]
>>> del l[2:5]
>>> l
[1, 2, 6, 7]
>>> del l[:]
>>> 
>>> l
[]
>>> del l
>>> l
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'l' is not defined

保存最后N个元素

q = collections.deque(maxlen=his)
q.append(line)

collections.deque

    q.popleft()
    q.pop()
    q.append()
    q.appendleft()

找到最大或者最小的N个元素

    h = [1,9,8,4,1]
    import heapq
    heapq.nlargest(3, h)
    heapq.nsmallest(3, h)

实现优先级队列

1.heapq入堆会排好序； 2.heapq的item需要保证可排序（a < b）,所以加index

import heapq

class PriorityQueue():
    def __init__(self):
        self.queue = []
        self.index = 0
        
    def push(self, item, priority):
        heapq.heappush(self.queue, (-priority, self.index, item))
        self.index += 1
        
    def pop(self):
        return heapq.heappop(self.queue)[-1]

一键多值的字典

collections.defaultdict(type)

有序字典

ordered_dict = collections.OrderedDict()

底层是双向链表，先入后出

>>> ordered_dict = collections.OrderedDict()

>>> ordered_dict["a"] = 1
>>> ordered_dict["b"] = 2
>>> ordered_dict.popitem()
('b', 2)

两个字典寻找相同映射

d1 = dict(…) d2 = dict(…) d1.keys() & d2.keys() d1.keys() - d2.keys()

统计出现次数

>>> l = [1,2,3,4,1,4,5,6,1,2,3,4,5,6,7]
>>> d = collections.Counter(l)
>>> print(d)
Counter({1: 3, 4: 3, 2: 2, 3: 2, 5: 2, 6: 2, 7: 1})
>>> d.most_common(1)
[(1, 3)]

>>> d1 = collections.Counter(l)
>>> d + d1
Counter({1: 6, 4: 6, 2: 4, 3: 4, 5: 4, 6: 4, 7: 2})
>>> d.update(l)
>>> print(d)
Counter({1: 6, 4: 6, 2: 4, 3: 4, 5: 4, 6: 4, 7: 2})

将名称映射到list的元素中

namedtuple比用dict更高效

    from collections import namedtuple
    Students = namedtuple("Student", ["name", "age"])
    s1 = Students("qwq", 20)
    print(s1, s1.name, s1.age)

同时对数据做转换和换算

在函数中使用生成器和表达式

sum(i**2 for i in range(10))

多个字典合并–Chain（Python3）

a = {"a":1, "b": 2}
b = {"b": 3, "c": 4}
from collections import ChainMap
merged = ChainMap(a, b)
print(merged["b"])
a["b"] = 100
print(merged["b"])

Sorted Containers

>>> import sortedcontainers
>>> help(sortedcontainers)
>>> from sortedcontainers import SortedDict
>>> help(SortedDict)
>>> help(SortedDict.popitem)

>>> from sortedcontainers import SortedList
>>> sl = SortedList(['e', 'a', 'c', 'd', 'b'])
>>> sl
SortedList(['a', 'b', 'c', 'd', 'e'])
>>> sl *= 1000000
>>> sl.count('c')
1000000
>>> sl[-3:]
['e', 'e', 'e']
>>> from sortedcontainers import SortedDict
>>> sd = SortedDict({'c': 3, 'a': 1, 'b': 2})
>>> sd
SortedDict({'a': 1, 'b': 2, 'c': 3})
>>> sd.popitem(index=-1)
('c', 3)
>>> from sortedcontainers import SortedSet
>>> ss = SortedSet('abracadabra')
>>> ss
SortedSet(['a', 'b', 'c', 'd', 'r'])
>>> ss.bisect_left('c')
2

refs

https://docs.python.org/

http://www.grantjenks.com/docs/sortedcontainers/index.html

https://docs.python.org/zh-cn/3/library/collections.html

https://docs.python.org/zh-cn/3/tutorial/stdlib.html