What are the options to clone or copy a list in Python?
Using new_list = my_list
then modifies new_list
every time my_list
changes.
Why is this?
转载于:https://stackoverflow.com/questions/2612802/how-to-clone-or-copy-a-list
With new_list = my_list
, you don't actually have two lists. The assignment just copies the reference to the list, not the actual list, so both new_list
and my_list
refer to the same list after the assignment.
To actually copy the list, you have various possibilities:
You can use the builtin list.copy()
method (available since python 3.3):
new_list = old_list.copy()
You can slice it:
new_list = old_list[:]
Alex Martelli's opinion (at least back in 2007) about this is, that it is a weird syntax and it does not make sense to use it ever. ;) (In his opinion, the next one is more readable).
You can use the built in list()
function:
new_list = list(old_list)
You can use generic copy.copy()
:
import copy
new_list = copy.copy(old_list)
This is a little slower than list()
because it has to find out the datatype of old_list
first.
If the list contains objects and you want to copy them as well, use generic copy.deepcopy()
:
import copy
new_list = copy.deepcopy(old_list)
Obviously the slowest and most memory-needing method, but sometimes unavoidable.
Example:
import copy
class Foo(object):
def __init__(self, val):
self.val = val
def __repr__(self):
return str(self.val)
foo = Foo(1)
a = ['foo', foo]
b = a.copy()
c = a[:]
d = list(a)
e = copy.copy(a)
f = copy.deepcopy(a)
# edit orignal list and instance
a.append('baz')
foo.val = 5
print('original: %r\n list.copy(): %r\n slice: %r\n list(): %r\n copy: %r\n deepcopy: %r'
% (a, b, c, d, e, f))
Result:
original: ['foo', 5, 'baz']
list.copy(): ['foo', 5]
slice: ['foo', 5]
list(): ['foo', 5]
copy: ['foo', 5]
deepcopy: ['foo', 1]
Use thing[:]
>>> a = [1,2]
>>> b = a[:]
>>> a += [3]
>>> a
[1, 2, 3]
>>> b
[1, 2]
>>>
Python's idiom for doing this is newList = oldList[:]
new_list = list(old_list)
Felix already provided an excellent answer, but I thought I'd do a speed comparison of the various methods:
copy.deepcopy(old_list)
Copy()
method copying classes with deepcopyCopy()
method not copying classes (only dicts/lists/tuples)for item in old_list: new_list.append(item)
[i for i in old_list]
(a list comprehension)copy.copy(old_list)
list(old_list)
new_list = []; new_list.extend(old_list)
old_list[:]
(list slicing)So the fastest is list slicing. But be aware that copy.copy()
, list[:]
and list(list)
, unlike copy.deepcopy()
and the python version don't copy any lists, dictionaries and class instances in the list, so if the originals change, they will change in the copied list too and vice versa.
(Here's the script if anyone's interested or wants to raise any issues:)
from copy import deepcopy
class old_class:
def __init__(self):
self.blah = 'blah'
class new_class(object):
def __init__(self):
self.blah = 'blah'
dignore = {str: None, unicode: None, int: None, type(None): None}
def Copy(obj, use_deepcopy=True):
t = type(obj)
if t in (list, tuple):
if t == tuple:
# Convert to a list if a tuple to
# allow assigning to when copying
is_tuple = True
obj = list(obj)
else:
# Otherwise just do a quick slice copy
obj = obj[:]
is_tuple = False
# Copy each item recursively
for x in xrange(len(obj)):
if type(obj[x]) in dignore:
continue
obj[x] = Copy(obj[x], use_deepcopy)
if is_tuple:
# Convert back into a tuple again
obj = tuple(obj)
elif t == dict:
# Use the fast shallow dict copy() method and copy any
# values which aren't immutable (like lists, dicts etc)
obj = obj.copy()
for k in obj:
if type(obj[k]) in dignore:
continue
obj[k] = Copy(obj[k], use_deepcopy)
elif t in dignore:
# Numeric or string/unicode?
# It's immutable, so ignore it!
pass
elif use_deepcopy:
obj = deepcopy(obj)
return obj
if __name__ == '__main__':
import copy
from time import time
num_times = 100000
L = [None, 'blah', 1, 543.4532,
['foo'], ('bar',), {'blah': 'blah'},
old_class(), new_class()]
t = time()
for i in xrange(num_times):
Copy(L)
print 'Custom Copy:', time()-t
t = time()
for i in xrange(num_times):
Copy(L, use_deepcopy=False)
print 'Custom Copy Only Copying Lists/Tuples/Dicts (no classes):', time()-t
t = time()
for i in xrange(num_times):
copy.copy(L)
print 'copy.copy:', time()-t
t = time()
for i in xrange(num_times):
copy.deepcopy(L)
print 'copy.deepcopy:', time()-t
t = time()
for i in xrange(num_times):
L[:]
print 'list slicing [:]:', time()-t
t = time()
for i in xrange(num_times):
list(L)
print 'list(L):', time()-t
t = time()
for i in xrange(num_times):
[i for i in L]
print 'list expression(L):', time()-t
t = time()
for i in xrange(num_times):
a = []
a.extend(L)
print 'list extend:', time()-t
t = time()
for i in xrange(num_times):
a = []
for y in L:
a.append(y)
print 'list append:', time()-t
t = time()
for i in xrange(num_times):
a = []
a.extend(i for i in L)
print 'generator expression extend:', time()-t
EDIT: Added new-style, old-style classes and dicts to the benchmarks, and made the python version much faster and added some more methods including list expressions and extend()
.
I've been told that Python 3.3+ adds list.copy()
method, which should be as fast as slicing:
newlist = old_list.copy()
What are the options to clone or copy a list in Python?
In Python 3, a shallow copy can be made with:
a_copy = a_list.copy()
In Python 2 and 3, you can get a shallow copy with a full slice of the original:
a_copy = a_list[:]
There are two semantic ways to copy a list. A shallow copy creates a new list of the same objects, a deep copy creates a new list containing new equivalent objects.
A shallow copy only copies the list itself, which is a container of references to the objects in the list. If the objects contained themselves are mutable and one is changed, the change will be reflected in both lists.
There are different ways to do this in Python 2 and 3. The Python 2 ways will also work in Python 3.
In Python 2, the idiomatic way of making a shallow copy of a list is with a complete slice of the original:
a_copy = a_list[:]
You can also accomplish the same thing by passing the list through the list constructor,
a_copy = list(a_list)
but using the constructor is less efficient:
>>> timeit
>>> l = range(20)
>>> min(timeit.repeat(lambda: l[:]))
0.30504298210144043
>>> min(timeit.repeat(lambda: list(l)))
0.40698814392089844
In Python 3, lists get the list.copy
method:
a_copy = a_list.copy()
In Python 3.5:
>>> import timeit
>>> l = list(range(20))
>>> min(timeit.repeat(lambda: l[:]))
0.38448613602668047
>>> min(timeit.repeat(lambda: list(l)))
0.6309100328944623
>>> min(timeit.repeat(lambda: l.copy()))
0.38122922903858125
Using new_list = my_list then modifies new_list every time my_list changes. Why is this?
my_list
is just a name that points to the actual list in memory. When you say new_list = my_list
you're not making a copy, you're just adding another name that points at that original list in memory. We can have similar issues when we make copies of lists.
>>> l = [[], [], []]
>>> l_copy = l[:]
>>> l_copy
[[], [], []]
>>> l_copy[0].append('foo')
>>> l_copy
[['foo'], [], []]
>>> l
[['foo'], [], []]
The list is just an array of pointers to the contents, so a shallow copy just copies the pointers, and so you have two different lists, but they have the same contents. To make copies of the contents, you need a deep copy.
To make a deep copy of a list, in Python 2 or 3, use deepcopy
in the copy
module:
import copy
a_deep_copy = copy.deepcopy(a_list)
To demonstrate how this allows us to make new sub-lists:
>>> import copy
>>> l
[['foo'], [], []]
>>> l_deep_copy = copy.deepcopy(l)
>>> l_deep_copy[0].pop()
'foo'
>>> l_deep_copy
[[], [], []]
>>> l
[['foo'], [], []]
And so we see that the deep copied list is an entirely different list from the original. You could roll your own function - but don't. You're likely to create bugs you otherwise wouldn't have by using the standard library's deepcopy function.
eval
You may see this used as a way to deepcopy, but don't do it:
problematic_deep_copy = eval(repr(a_list))
In 64 bit Python 2.7:
>>> import timeit
>>> import copy
>>> l = range(10)
>>> min(timeit.repeat(lambda: copy.deepcopy(l)))
27.55826997756958
>>> min(timeit.repeat(lambda: eval(repr(l))))
29.04534101486206
on 64 bit Python 3.5:
>>> import timeit
>>> import copy
>>> l = list(range(10))
>>> min(timeit.repeat(lambda: copy.deepcopy(l)))
16.84255409205798
>>> min(timeit.repeat(lambda: eval(repr(l))))
34.813894678023644
There are many answers already that tell you how to make a proper copy, but none of them say why your original 'copy' failed.
Python doesn't store values in variables; it binds names to objects. Your original assignment took the object referred to by my_list
and bound it to new_list
as well. No matter which name you use there is still only one list, so changes made when referring to it as my_list
will persist when referring to it as new_list
. Each of the other answers to this question give you different ways of creating a new object to bind to new_list
.
Each element of a list acts like a name, in that each element binds non-exclusively to an object. A shallow copy creates a new list whose elements bind to the same objects as before.
new_list = list(my_list) # or my_list[:], but I prefer this syntax
# is simply a shorter way of:
new_list = [element for element in my_list]
To take your list copy one step further, copy each object that your list refers to, and bind those element copies to a new list.
import copy
# each element must have __copy__ defined for this...
new_list = [copy.copy(element) for element in my_list]
This is not yet a deep copy, because each element of a list may refer to other objects, just like the list is bound to its elements. To recursively copy every element in the list, and then each other object referred to by each element, and so on: perform a deep copy.
import copy
# each element must have __deepcopy__ defined for this...
new_list = copy.deepcopy(my_list)
See the documentation for more information about corner cases in copying.
All of the other contributors gave great answers, which work when you have a single dimension (leveled) list, however of the methods mentioned so far, only copy.deepcopy()
works to clone/copy a list and not have it point to the nested list
objects when you are working with multidimensional, nested lists (list of lists). While Felix Kling refers to it in his answer, there is a little bit more to the issue and possibly a workaround using built-ins that might prove a faster alternative to deepcopy
.
While new_list = old_list[:]
, copy.copy(old_list)'
and for Py3k old_list.copy()
work for single-leveled lists, they revert to pointing at the list
objects nested within the old_list
and the new_list
, and changes to one of the list
objects are perpetuated in the other.
As was pointed out by both Aaron Hall and PM 2Ring using
eval()
is not only a bad idea, it is also much slower thancopy.deepcopy()
.This means that for multidimensional lists, the only option is
copy.deepcopy()
. With that being said, it really isn't an option as the performance goes way south when you try to use it on a moderately sized multidimensional array. I tried totimeit
using a 42x42 array, not unheard of or even that large for bioinformatics applications, and I gave up on waiting for a response and just started typing my edit to this post.It would seem that the only real option then is to initialize multiple lists and work on them independently. If anyone has any other suggestions, for how to handle multidimensional list copying, it would be appreciated.
As others have stated, there can be are significant performance issues using the copy
module and copy.deepcopy
for multidimensional lists. Trying to work out a different way of copying the multidimensional list without using deepcopy
, (I was working on a problem for a course that only allows 5 seconds for the entire algorithm to run in order to receive credit), I came up with a way of using built-in functions to make a copy of the nested list without having them point at one another or at the list
objects nested within them. I used eval()
and repr()
in the assignment to make the copy of the old list into the new list without creating a link to the old list. It takes the form of:
new_list = eval(repr(old_list))
Basically what this does is make a representation of old_list
as a string and then evaluates the string as if it were the object that the string represents. By doing this, no link to the original list
object is made. A new list
object is created and each variable points to its own independent object. Here is an example using a 2 dimensional nested list.
old_list = [[0 for j in range(y)] for i in range(x)] # initialize (x,y) nested list
# assign a copy of old_list to new list without them pointing to the same list object
new_list = eval(repr(old_list))
# make a change to new_list
for j in range(y):
for i in range(x):
new_list[i][j] += 1
If you then check the contents of each list, for example a 4 by 3 list, Python will return
>>> new_list
[[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]]
>>> old_list
[[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
While this probably isn't the canonical or syntactically correct way to do it, it seems to work well. I haven't tested performance, but I am going to guess that eval()
and rep()
will have less overhead to run than deepcopy
will.
Unlike other languages that have variable and value, Python has name and object.
This statement:
a = [1,2,3]
means to give the list (object) a name a
, and, this:
b = a
just gives the same object a
a new name b
, so whenever you do something with a
, the object changes and therefore b
changes.
The only way to make a really copy of a is to create a new object like other answers already have said.
You can see more about this here.
Here are the timing results using Python 3.6.0. Keep in mind these times are relative to one another, not absolute.
I stuck to only doing shallow copies, and also added some new methods that weren't possible in Python2, such as list.copy()
(the Python3 slice equivalent) and list unpacking (*new_list, = list
):
METHOD TIME TAKEN
b = a[:] 6.468942025996512 #Python2 winner
b = a.copy() 6.986593422974693 #Python3 "slice equivalent"
b = []; b.extend(a) 7.309216841997113
b = a[0:len(a)] 10.916740721993847
*b, = a 11.046738261007704
b = list(a) 11.761539687984623
b = [i for i in a] 24.66165203397395
b = copy.copy(a) 30.853400873980718
b = []
for item in a:
b.append(item) 48.19176080400939
We can see the old winner still comes out on top, but not really by a huge amount, considering the increased readability of the Python3 list.copy()
approach.
Note that these methods do not output equivalent results for any input other than lists. They all work for sliceable objects, a few work for any iterable, but only copy.copy()
works for any Python object.
Here is the testing code for interested parties (Template from here):
import timeit
COUNT = 50000000
print("Array duplicating. Tests run", COUNT, "times")
setup = 'a = [0,1,2,3,4,5,6,7,8,9]; import copy'
print("b = list(a)\t\t", timeit.timeit(stmt='b = list(a)', setup=setup, number=COUNT))
print("b = copy.copy(a)\t\t", timeit.timeit(stmt='b = copy.copy(a)', setup=setup, number=COUNT))
print("b = a.copy()\t\t", timeit.timeit(stmt='b = a.copy()', setup=setup, number=COUNT))
print("b = a[:]\t\t", timeit.timeit(stmt='b = a[:]', setup=setup, number=COUNT))
print("b = a[0:len(a)]\t", timeit.timeit(stmt='b = a[0:len(a)]', setup=setup, number=COUNT))
print("*b, = a\t", timeit.timeit(stmt='*b, = a', setup=setup, number=COUNT))
print("b = []; b.extend(a)\t", timeit.timeit(stmt='b = []; b.extend(a)', setup=setup, number=COUNT))
print("b = []\nfor item in a: b.append(item)\t", timeit.timeit(stmt='b = []\nfor item in a: b.append(item)', setup=setup, number=COUNT))
print("b = [i for i in a]\t", timeit.timeit(stmt='b = [i for i in a]', setup=setup, number=COUNT))
new_list = my_list[:]
new_list = my_list
Try to understand this. Let's say that my_list is in the heap memory at location X i.e. my_list is pointing to the X. Now by assigning new_list = my_list
you're Letting new_list pointing to the X. This is known as shallow Copy.
Now if you assign new_list = my_list[:]
You're simply copying each object of my_list to new_list. This is known as Deep copy.
The Other way you can do this are :
new_list = list(old_list)
import copy new_list = copy.deepcopy(old_list)
Not sure if this is still actual, but the same behavior holds for dictionaries as well. Look at this example.
a = {'par' : [1,21,3], 'sar' : [5,6,8]}
b = a
c = a.copy()
a['har'] = [1,2,3]
a
Out[14]: {'har': [1, 2, 3], 'par': [1, 21, 3], 'sar': [5, 6, 8]}
b
Out[15]: {'har': [1, 2, 3], 'par': [1, 21, 3], 'sar': [5, 6, 8]}
c
Out[16]: {'par': [1, 21, 3], 'sar': [5, 6, 8]}
A very simple approach independent of python version was missing in already given answers which you can use most of the time (at least I do):
new_list = my_list * 1 #Solution 1 when you are not using nested lists
However, If my_list contains other containers (for eg. nested lists) you must use deepcopy as others suggested in the answers above from the copy library. For example:
import copy
new_list = copy.deepcopy(my_list) #Solution 2 when you are using nested lists
.Bonus: If you don't want to copy elements use (aka shallow copy):
new_list = my_list[:]
Let's understand difference between Solution#1 and Solution #2
>>> a = range(5)
>>> b = a*1
>>> a,b
([0, 1, 2, 3, 4], [0, 1, 2, 3, 4])
>>> a[2] = 55
>>> a,b
([0, 1, 55, 3, 4], [0, 1, 2, 3, 4])
As you can see Solution #1 worked perfectly when we were not using the nested lists. Let's check what will happen when we apply solution #1 to nested lists.
>>> from copy import deepcopy
>>> a = [range(i,i+4) for i in range(3)]
>>> a
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
>>> b = a*1
>>> c = deepcopy(a)
>>> for i in (a, b, c): print i
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
>>> a[2].append('99')
>>> for i in (a, b, c): print i
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5, 99]]
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5, 99]] #Solution#1 didn't work in nested list
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]] #Solution #2 - DeepCopy worked in nested list
Let's start from the beginning and explorer it a little deep :
So Suppose you have two list :
list_1=['01','98']
list_2=[['01','98']]
And we have to copy both list , now starting from the first list:
So first let's try by general method of copy:
copy=list_1
Now if you are thinking copy copied the list_1 then you can be wrong, let's check it:
The id() function shows us that both variables point to the same list object, i.e. they share this object.
print(id(copy))
print(id(list_1))
output:
4329485320
4329485320
Surprised ? Ok let's explore it:
So as we know python doesn't store anything in a variable, Variables are just referencing to the object and object store the value. Here object is list
but we created two references to that same object by two different variable names. So both variables are pointing to the same object :
so when you do copy=list_1
what actually its doing :
Here in the image list_1 and copy are two variable names but the object is same for both variable which is list
So if you try to modify copied list then it will modify the original list too because the list is only one there, you will modify that list no matter you do from the copied list or from the original list:
copy[0]="modify"
print(copy)
print(list_1)
output:
['modify', '98']
['modify', '98']
So it modified the original list :
What is the solution then?
Solution :
Now let's move to a second pythonic method of copying list:
copy_1=list_1[:]
Now this method fix the thing what we were facing in first issue let's check it :
print(id(copy_1))
print(id(list_1))
4338792136
4338791432
So as we can see our both list having different id and it means both variables are pointing to different objects so what actually going on here is :
Now let's try to modify the list and let's see if we still face the previous problem :
copy_1[0]="modify"
print(list_1)
print(copy_1)
Output:
['01', '98']
['modify', '98']
So as you can see it is not modifying the original list, it only modified the copied list, So we are ok with it.
So now i think we are done? wait we have to copy the second nested list too so let's try pythonic way :
copy_2=list_2[:]
So list_2 should reference to another object which is copy of list_2 let's check:
print(id((list_2)),id(copy_2))
we get the output:
4330403592 4330403528
Now we can assume both lists are pointing different object so now let's try to modify it and let's see it is giving what we want :
So when we try:
copy_2[0][1]="modify"
print(list_2,copy_2)
it gives us output:
[['01', 'modify']] [['01', 'modify']]
Now, this is little confusing we used the pythonic way and still, we are facing the same issue.
let's understand it:
So when we do :
copy_2=list_2[:]
we are actually copying the outer list only, not the nested list, so nested list is same object for both list, let's check:
print(id(copy_2[0]))
print(id(list_2[0]))
output:
4329485832
4329485832
So actually when we do copy_2=list_2[:]
this is what happens:
It creates the copy of list but only outer list copy, not the nested list copy, nested list is same for both variable so if you try to modify the nested list then it will modify the original list too because nested list object is same for both nested list.
So what is the solution?
Solution is deep copy
from copy import deepcopy
deep=deepcopy(list_2)
So now let's check it :
print(id((list_2)),id(deep))
output:
4322146056 4322148040
both id are different , now let's check nested list id:
print(id(deep[0]))
print(id(list_2[0]))
output:
4322145992
4322145800
As you can see both id are different so we can assume that both nested list are pointing different object now.
So when you do deep=deepcopy(list_2)
what actually happens :
So both nested list are pointing different object and they have seprate copy of nested list now.
Now let's try to modify the nested list and let's see if it solved the previous issue or not:
so if we do :
deep[0][1]="modify"
print(list_2,deep)
output:
[['01', '98']] [['01', 'modify']]
So as you can see it didn't modify the original nested list , it only modified the copied list.
If you like my detailed answer , let me know by upvoting it , if you have any doubt realted this answer , comment down :)
you can use bulit in list() function:
newlist=list(oldlist)
i think this code will help you.
It surprises me that this hasn't been mentioned yet, so for the sake of completeness...
You can perform list unpacking with the "splat operator": *
, which will also copy elements of your list.
old_list = [1, 2, 3]
new_list = [*old_list]
new_list.append(4)
old_list == [1, 2, 3]
new_list == [1, 2, 3, 4]
The obvious downside to this method is that it is only available in Python 3.5+.
Timing wise though, this appears to perform better than other common methods.
x = [random.random() for _ in range(1000)]
%timeit a = list(x)
%timeit a = x.copy()
%timeit a = x[:]
%timeit a = [*x]
#: 2.47 µs ± 38.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
#: 2.47 µs ± 54.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
#: 2.39 µs ± 58.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
#: 2.22 µs ± 43.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Note that there are some cases where if you have defined your own custom class and you want to keep the attributes then you should use copy.copy()
or copy.deepcopy()
rather than the alternatives, for example in Python 3:
import copy
class MyList(list):
pass
lst = MyList([1,2,3])
lst.name = 'custom list'
d = {
'original': lst,
'slicecopy' : lst[:],
'lstcopy' : lst.copy(),
'copycopy': copy.copy(lst),
'deepcopy': copy.deepcopy(lst)
}
for k,v in d.items():
print('lst: {}'.format(k), end=', ')
try:
name = v.name
except AttributeError:
name = 'NA'
print('name: {}'.format(name))
Outputs:
lst: original, name: custom list
lst: slicecopy, name: NA
lst: lstcopy, name: NA
lst: copycopy, name: custom list
lst: deepcopy, name: custom list