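drop_duplicates(subset=['a', 'b'])
Does this mean a row is removed as a duplicate only when its values in both column a and column b match those of another row?
I'm looking for a detailed explanation of how this function works. I searched for a long time and couldn't find a clear one.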
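drop_duplicates takes only four parameters: subset, keep, inplace, and ignore_index (ignore_index was added in pandas 1.0, so help output from older versions does not list it).
With subset=['a', 'b'], a row is dropped as a duplicate only when another row has the same values in both a and b.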
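As a minimal sketch of that behaviour: the DataFrame below, including the extra column c, is made up purely for illustration and is not from the original question.

```python
import pandas as pd

# Hypothetical sample data: column c is only there to show which rows survive.
df = pd.DataFrame({
    "a": [1, 1, 1, 2],
    "b": ["x", "x", "y", "x"],
    "c": [10, 20, 30, 40],
})

# Rows 0 and 1 share the same (a, b) pair, so the later one (row 1) is dropped.
# Row 2 matches row 0 only on a, and row 3 matches only on b, so both are kept.
deduped = df.drop_duplicates(subset=["a", "b"])
print(deduped)

# Without subset, every column is compared; no two rows are fully identical
# here (c always differs), so nothing is removed.
all_cols = df.drop_duplicates()
print(len(all_cols))
```

Note that the surviving rows keep their original index labels unless you ask for them to be renumbered.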
Help on method drop_duplicates in module pandas.core.frame:
drop_duplicates(subset=None, keep='first', inplace=False) method of pandas.core.frame.DataFrame instance
Return DataFrame with duplicate rows removed, optionally only
considering certain columns.
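(In other words: it returns a DataFrame with the duplicate rows removed, and you can optionally restrict the comparison to the given columns.)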
Parameters
----------
subset : column label or sequence of labels, optional
Only consider certain columns for identifying duplicates, by
default use all of the columns
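(A column label or a sequence of column labels. You can pass one or more columns as the deduplication criterion, in which case a row counts as a duplicate only when all of those columns match. By default every column is used, so two rows must be identical in all of their values to count as duplicates.)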
keep : {'first', 'last', False}, default 'first'
- first : Drop duplicates except for the first occurrence.
- last : Drop duplicates except for the last occurrence.
- False : Drop all duplicates.
inplace : boolean, default False
Whether to drop duplicates in place or to return a copy
Returns
-------
deduplicated : DataFrame
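To see how keep, inplace and ignore_index differ in practice, here is a small sketch. The sample data is hypothetical, and the ignore_index call assumes pandas 1.0 or later.

```python
import pandas as pd

# Hypothetical sample data: rows 0 and 1 are duplicates on (a, b).
df = pd.DataFrame({"a": [1, 1, 2], "b": ["x", "x", "y"], "c": [10, 20, 30]})

# keep='first' (the default) keeps the first row of each duplicate group.
first = df.drop_duplicates(subset=["a", "b"], keep="first")

# keep='last' keeps the last row of each duplicate group instead.
last = df.drop_duplicates(subset=["a", "b"], keep="last")

# keep=False drops every row that has a duplicate, keeping none of the group.
none_kept = df.drop_duplicates(subset=["a", "b"], keep=False)

# inplace=True modifies the DataFrame itself and returns None.
work = df.copy()
result = work.drop_duplicates(subset=["a", "b"], inplace=True)

# ignore_index=True (assumes pandas >= 1.0) renumbers the result from 0.
renumbered = df.drop_duplicates(subset=["a", "b"], keep="last", ignore_index=True)

print(first, last, none_kept, work, renumbered, sep="\n\n")
```

Because inplace=True returns None, writing df = df.drop_duplicates(..., inplace=True) would replace df with None; either assign the returned copy or use inplace, not both.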