Pandas DataFrame Copy: How to Copy DataFrame using df.copy()

The Pandas copy() method creates a copy of a Pandas object. Assigning an existing object to a new variable does not create a copy: the new variable is just another reference to the same object, so any change made through it also changes the original data.
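The difference between plain assignment and copy() can be illustrated with a minimal sketch. The one-column DataFrame and the names alias and snapshot are made up for illustration.

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical data)
df = pd.DataFrame({"Season": [3, 12, 1, 2]})

alias = df            # plain assignment: a second name for the same object
snapshot = df.copy()  # an independent copy (deep=True by default)

# Change one value through the alias
alias.loc[0, "Season"] = 4

print(alias is df)                # True: both names refer to one object
print(df.loc[0, "Season"])        # the change is visible through df
print(snapshot.loc[0, "Season"])  # the copy still holds the old value
```

Because alias and df are the same object, there is only one DataFrame to modify. The snapshot was taken before the change, so it is unaffected.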

Pandas DataFrame Copy

To copy a Pandas DataFrame, use the copy() method. The DataFrame.copy() method makes a copy of the calling object’s indices and data. It accepts one parameter, deep, and returns a Series or DataFrame that matches the type of the caller.

Syntax

DataFrame.copy(deep=True)

Parameters

deep: bool, default True.

When deep=True (default), the new object is created with a copy of the calling object’s data and indices. Changes to the data or indices of the copy will not be reflected in the original object.

When deep=False, the new object is created without copying the calling object’s data or index (only references to the data and Index are copied). Any modifications to the data of the original will be reflected in the shallow copy (and vice versa). In pandas versions that use Copy-on-Write (the default from pandas 3.0), the shallow copy instead behaves as a lazy copy, and changes are no longer propagated between the two objects.
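One way to see what the deep parameter does is to check whether the copies share memory with the original. The sketch below is an illustration, not part of the original example. It assumes a single integer column and uses NumPy's np.shares_memory on the arrays returned by to_numpy().

```python
import numpy as np
import pandas as pd

# Single integer column, so to_numpy() can expose the underlying buffer
df = pd.DataFrame({"Season": [3, 12, 1, 2]})

deep = df.copy()               # deep=True is the default
shallow = df.copy(deep=False)  # no eager copy of the data

# np.shares_memory reports whether two arrays overlap in memory
deep_shares = np.shares_memory(df["Season"].to_numpy(),
                               deep["Season"].to_numpy())
shallow_shares = np.shares_memory(df["Season"].to_numpy(),
                                  shallow["Season"].to_numpy())

print("deep copy shares memory with df:   ", deep_shares)
print("shallow copy shares memory with df:", shallow_shares)

# Writing to the deep copy never touches the original
deep.loc[0, "Season"] = 99
print(df.loc[0, "Season"])
```

The memory check is used here instead of writing through the shallow copy because the effect of such a write depends on whether Copy-on-Write is active in the installed pandas version.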

Return Value

The copy() method returns a new Series or DataFrame, matching the type of the caller.
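As a quick check of the return type, the following sketch calls copy() on both a DataFrame and a Series. The data and variable names are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({"Season": [3, 12, 1, 2]})
season = df["Season"]  # selecting one column gives a Series

df_copy = df.copy()
season_copy = season.copy()

print(type(df_copy).__name__)      # DataFrame
print(type(season_copy).__name__)  # Series

# The copy holds the same values as the caller
print(df_copy.equals(df))
```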

Example

Write the following code inside the app.py file.

# app.py

import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Pedro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
print('Original DataFrame')
print(df)
print('----------------------------------------------------')
dfCopy = df.copy()
print('Copied DataFrame')
print(dfCopy)

Output

Original DataFrame
              Show     Streaming  Season  Main Actor
0  Stranger Things       Netflix       3      Millie
1      The X-Files            Fx      12     Gillian
2      Mandalorian   Disney Plus       1       Pedro
3         The Boys  Amazon Prime       2  Karl Urban
----------------------------------------------------
Copied DataFrame
              Show     Streaming  Season  Main Actor
0  Stranger Things       Netflix       3      Millie
1      The X-Files            Fx      12     Gillian
2      Mandalorian   Disney Plus       1       Pedro
3         The Boys  Amazon Prime       2  Karl Urban

In this example, we defined a DataFrame, used the df.copy() method to copy it, and printed both the original and the copied DataFrame. We did not pass any argument to the copy() method, so it made a deep copy.
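Printing the two DataFrames only shows that they start out equal. To see that the copy is independent, you can edit it and look at the original again. This is a shortened sketch of the example above. The 'Watched' column is made up for illustration.

```python
import pandas as pd

data = {'Show': ['Stranger Things', 'The Boys'],
        'Season': [3, 2]}
df = pd.DataFrame(data)
dfCopy = df.copy()

# Edit only the copy: change a value and add a hypothetical column
dfCopy.loc[0, 'Season'] = 4
dfCopy['Watched'] = [True, False]

print('Original DataFrame')
print(df)       # still two columns, Season of the first row is still 3
print('Copied DataFrame')
print(dfCopy)   # three columns, Season of the first row is 4
```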

Shallow copy and Deep Copy in Pandas DataFrame

To create a deep copy of a Pandas DataFrame, use df.copy() or df.copy(deep=True).

To create a shallow copy of a Pandas DataFrame, use df.copy(deep=False).

A deep copy gets its own copy of the calling object’s data and indices, so changes to the data or indices of the copy will not be reflected in the original object.

A shallow copy is created without copying the calling object’s data or index (only references to the data and Index are copied), so any modifications to the data of the original will be reflected in the shallow copy.

See the following code.

# app.py

import pandas as pd

data = {'Show': ['Stranger Things', 'The X-Files', 'Mandalorian', 'The Boys'],
        'Streaming': ['Netflix', 'Fx', 'Disney Plus', 'Amazon Prime'],
        'Season': [3, 12, 1, 2],
        'Main Actor': ['Millie', 'Gillian', 'Pedro', 'Karl Urban']}
df = pd.DataFrame.from_dict(data)
deepCopy = df.copy()
shallowCopy = df.copy(deep=False)

# The "is" operator checks object identity, not equality of contents.
print('df is shallowCopy: ', df is shallowCopy)
print('df is deepCopy: ', df is deepCopy)

print('----------------------------------------------------')

# .values builds a new NumPy array on each call for this mixed-type
# DataFrame, so these two identity checks are always False.
print('shallowCopy.values is df.values: ',
      shallowCopy.values is df.values)
print('deepCopy.values is df.values: ', deepCopy.values is df.values)

print('----------------------------------------------------')

print('shallowCopy.index is df.index: ',
      shallowCopy.index is df.index)
print('deepCopy.index is df.index: ', deepCopy.index is df.index)

Output

df is shallowCopy:  False
df is deepCopy:  False
----------------------------------------------------
shallowCopy.values is df.values:  False
deepCopy.values is df.values:  False
----------------------------------------------------
shallowCopy.index is df.index:  True
deepCopy.index is df.index:  False

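The is operator tests object identity, so these checks show which objects are shared, not whether the contents are equal. The exact results may differ between pandas versions. Together with the behavior of copy() described earlier, the output leads to the following observations.

  1. A shallow copy is a new DataFrame object, but it shares its data and Index with the original. In the output above, shallowCopy.index is the very same object as df.index. Both .values checks print False only because .values builds a new NumPy array on every call for this mixed-type DataFrame, so that check cannot reveal whether the data is shared.
  2. A deep copy does not share the data and Index with the original object. It has its own copy of both.
  3. When the object contains Python objects (for example, lists stored in cells), a deep copy copies the data but does not do so recursively. Only the references to those objects are copied, so updating a nested object will be reflected in the deep copy.
  4. While Index objects are copied when deep=True, the underlying NumPy array is not copied, for performance reasons. Since the Index is immutable, the underlying data can be safely shared, and a copy is not needed.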
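The point about nested Python objects can be sketched as follows. The 'Cast' column holding lists is a hypothetical addition for illustration. Mapping copy.deepcopy over the column is one possible workaround, shown here as an assumption and not as a method from the pandas documentation.

```python
import copy
import pandas as pd

# 'Cast' is a hypothetical column whose cells hold Python lists
df = pd.DataFrame({'Show': ['The Boys', 'Mandalorian'],
                   'Cast': [['Karl Urban'], ['Pedro']]})

# Deep copy: the column of references is copied, the lists are not
deep = df.copy()

# Possible workaround: copy each nested list explicitly
independent = df.copy()
independent['Cast'] = df['Cast'].map(copy.deepcopy)

# Mutate a nested list in the original DataFrame in place
df.loc[0, 'Cast'].append('Jack Quaid')

print(deep.loc[0, 'Cast'])         # the deep copy sees the appended name
print(independent.loc[0, 'Cast'])  # the per-cell copy does not
```

The deep copy and the original hold references to the same list objects, which is why an in-place change to a list shows up in both.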

That is it for the Pandas DataFrame copy() method.
