In Python data cleaning, the set() function is mainly used to remove duplicate elements and to perform set operations:
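- Removing duplicate elements from a list or tuple:
    my_list = [1, 2, 3, 4, 4, 5, 6, 6]
    unique_list = list(set(my_list))
    # Sets are unordered, so the original order is not guaranteed to survive
    print(unique_list)  # Output: [1, 2, 3, 4, 5, 6]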
- Intersection (elements present in both sets):
    setA = {1, 2, 3, 4}
    setB = {3, 4, 5, 6}
    intersection = setA.intersection(setB)
    print(intersection)  # Output: {3, 4}
- Union (elements present in either set):
    setA = {1, 2, 3, 4}
    setB = {3, 4, 5, 6}
    union = setA.union(setB)
    print(union)  # Output: {1, 2, 3, 4, 5, 6}
- Difference (elements in setA that are not in setB):
    setA = {1, 2, 3, 4}
    setB = {3, 4, 5, 6}
    difference = setA.difference(setB)
    print(difference)  # Output: {1, 2}
- Symmetric difference (elements in exactly one of the two sets):
    setA = {1, 2, 3, 4}
    setB = {3, 4, 5, 6}
    symmetric_difference = setA.symmetric_difference(setB)
    print(symmetric_difference)  # Output: {1, 2, 5, 6}
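The method calls above also have operator equivalents. The following is a minimal sketch that reuses the same setA and setB values. The name from_list is introduced here only for illustration. One difference worth knowing: the operators require a set on both sides, while the methods accept any iterable.

```python
setA = {1, 2, 3, 4}
setB = {3, 4, 5, 6}

# Operator forms of the four set operations
intersection = setA & setB            # same as setA.intersection(setB)
union = setA | setB                   # same as setA.union(setB)
difference = setA - setB              # same as setA.difference(setB)
symmetric_difference = setA ^ setB    # same as setA.symmetric_difference(setB)

# The method form accepts any iterable, such as a list;
# setA & [3, 4, 5, 6] would raise a TypeError instead.
from_list = setA.intersection([3, 4, 5, 6])

print(intersection == from_list)  # True
```

The operator form is shorter when both operands are already sets. The method form saves a conversion when one side is a list or another iterable.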
By using the set() function and set operations, you can process and clean data more effectively.
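As one possible illustration of a cleaning step, the sketch below normalizes some values, removes duplicates while keeping their original order, and compares the result against a reference set. The helper dedupe_keep_order and the sample data raw_emails and known_emails are hypothetical and not part of the original answer. The helper relies on dict.fromkeys rather than set(), because a set does not keep insertion order.

```python
def dedupe_keep_order(items):
    """Remove duplicates, keeping the first occurrence of each item."""
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(items))


# Hypothetical raw data with inconsistent case and stray whitespace
raw_emails = [
    " Alice@example.com",
    "bob@example.com",
    "alice@example.com ",
    "carol@example.com",
]
known_emails = {"bob@example.com", "dave@example.com"}

# Normalize first so that near-duplicates compare equal
cleaned = dedupe_keep_order(e.strip().lower() for e in raw_emails)

# Membership tests against a set take constant time on average
new_emails = [e for e in cleaned if e not in known_emails]

# Known addresses that are absent from the cleaned data
missing = known_emails - set(cleaned)

print(cleaned)     # ['alice@example.com', 'bob@example.com', 'carol@example.com']
print(new_emails)  # ['alice@example.com', 'carol@example.com']
print(missing)     # {'dave@example.com'}
```

Normalizing before deduplicating matters here: without strip() and lower(), the two spellings of the same address would be treated as distinct elements.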