Open In App

Difference between merge() and concat() in Pandas

Last Updated : 25 Nov, 2024
Summarize
Comments
Improve
Suggest changes
Like Article
Like
Share
Report
News Follow

When working with data in pandas, combining multiple DataFrames is a common task. Two primary methods for this are merge() and concat(). While both functions help combine DataFrames, they serve different purposes and operate in distinct ways.

Merge function is used to merge two dataframes based on the common columns or indexes and use concat() when you want to stack DataFrames either vertically or horizontally without needing a key. Below is a sample example that illustrates the use of merge and concat.

Python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3],'Name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'ID': [2, 3, 4],'Age': [25, 30, 35]})

# Merge dataframes on 'ID'
merged_df = pd.merge(df1, df2, on='ID', how='inner')
print("Merged DataFrame (using merge):")
print(merged_df)

# Concatenating the two dataframes along columns (axis=1)
concat_df = pd.concat([df1, df2], axis=1)
print("\nConcatenated DataFrame (using concat):")
print(concat_df)

Output:

Screenshot-2024-11-18-183655
merge() and concat() in pandas

In merge, we can see that based on ID we have merged two dataframes. When we are using concat operation, we are not using any condition and just placing the tables side by side.

When to use merge and concat?

These two functions totally serve different purpose. Merge should be used when we need the rows that satisfy the criteria's. In simple terms, Merge basically filters the data. Concat on the other hand is used when we need to add more data. Using concat we can add rows or columns as per our requirements.

Characteristics

merge

concat

Use

Used to join dataframes based on common columns.

Used to combine dataframes along any axis.

Common keys

Merge makes use of common columns to join the dataframes

Concat does not make use of common columns

Joins

Merge supports four types of joins: inner, outer, left and right.

Concat supports only two types of joins: outer and inner.

Axis

Does not make use of any axis parameter

Makes use of axis parameter. Axis=0 denotes rows and Axis=1 denotes columns.

on parameter

Makes use of on parameter. In this the common column names are passed.

Does not make use of on parameter.

1. Using merge function on common columns

Merge function is used to merge two or more dataframes based on common columns which basically acts like conditions. It can be compared to the joins in SQL. Here we can perform 4 types of joins or merge operations. They are as follows:

  • Inner: This basically returns the rows that have matching values in both the dataframes.
  • Outer: This basically returns all the rows. For those rows whose particular value does not exists, it returns NaN value.
  • Left: It returns all the rows from the left table or the first table and the matching rows from the second table.
  • Right: It is similar to the left join. It returns all the rows from the second table and matching rows from the first table.

By default merge function joins on the basis of inner join.

Syntax:

pd.merge(df1, df2, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, suffixes=('_x', '_y'))

  • df1 is the first table.
  • df2 is the second table.
  • how basically defines the type of join. It can be inner, outer, left or right.
  • on denotes the common columns.
  • left_on and right_on is used if the common columns have different column names.
  • left_index and right_index is used when index can be used as common column to merge the tables.
  • suffixes is used for other overlapping common column names.

Below is an example that makes use of merge operation.

Python
import pandas as pd

df1 = pd.DataFrame({'Roll Number': [101, 102, 103],'Name': ['Alice', 'Bob', 'Charlie'],'Marks': [85, 92, 78]})
df2 = pd.DataFrame({'Roll Number': [101, 102, 103],'Subject': ['Math', 'Science', 'English']})

# Merging the two DataFrames on 'Roll Number'
merged_df = pd.merge(df1, df2, on='Roll Number', how='inner')
print(merged_df)

Output:

Screenshot-2024-11-19-193151
Difference(s) between merge() and concat() in pandas

2. Using concat function to combine along any axis

Concat is used to combine two or more dataframes along column or row manner. In simple terms we can say that we can stack either horizontally or vertically. Unlike merge function, concat does not need any common column and is mainly used when we need to add more data to our dataframes.

Syntax

pd.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=False, copy=True)

  • objs represents the list of dataframes.
  • axis denotes row wise concatenation or column wise concatenation. 0 denotes row and 1 denotes column
  • join can be either inner or outer
  • keys is used to keep track of the dataframes in hierarchical format.
  • levels is used to specify custom indexes.
  • Name is used to define name for the hierarchical indexes.
  • verify_integrity ensures that the indexes are unique.
  • sort is used to sort the columns.
  • copy is used to keep the copy of data.

Below is an example that makes use of concat operation.

Python
import pandas as pd

df1 = pd.DataFrame({'Roll Number': [101, 102, 103],'Name': ['Alice', 'Bob', 'Charlie'],'Marks': [85, 92, 78]})
df2 = pd.DataFrame({'Roll Number': [104, 105, 106],'Name': ['David', 'Eve', 'Frank'],'Marks': [88, 95, 82]})

# Concatenating the two DataFrames
concatenated_df = pd.concat([df1, df2], ignore_index=True)
print(concatenated_df)

Output:

Screenshot-2024-11-18-184320
Difference(s) between merge() and concat() in pandas

As we can see that the two tables have been stacked on over another. Although there are common columns, but it in concat it does not make use of common columns.

Difference(s) between merge() and concat() in pandas - Examples

Here in this we have implemented the use of Merge and concat in the dataframes.

Example 1: In this example we have two dataframes. One dataframe comprises of fruit names along with other details and the second dataframe comprises of fruit name, origin etc. We will use merge function and inner join to combine two dataframes.

Python
import pandas as pd

# Creating the DataFrames
df1 = pd.DataFrame({
    'Fruit': ['Apple', 'Banana', 'Cherry'],
    'Color': ['Red', 'Yellow', 'Red'],
    'Origin': ['USA', 'Ecuador', 'Turkey']
})

df2 = pd.DataFrame({
    'Fruit': ['Apple', 'Banana', 'Mango'],
    'Price': [1.2, 0.5, 1.5],
    'Store': ['Walmart', 'Costco', 'Trader Joe\'s']
})
print('-----merge-----')
# Merge based on 'Fruit' column (inner join by default)
merged_df = pd.merge(df1, df2, on='Fruit', how='inner')
print(merged_df)

Output:

Screenshot-2024-11-19-194735
Difference(s) between merge() and concat() in pandas

Example 2: Here we are calculating the net sales in North and South region. For this we will use concat and join them row wise.

Python
import pandas as pd

# Creating the DataFrames
df1 = pd.DataFrame({
    'Region': ['North', 'North'],
    'Product': ['Apples', 'Bananas'],
    'Sales': [120, 80]
})

df2 = pd.DataFrame({
    'Region': ['South', 'South'],
    'Product': ['Apples', 'Oranges'],
    'Sales': [100, 150]
})

# Concatenating rows
result = pd.concat([df1, df2], ignore_index=True)
print(result)

Output:

Screenshot-2024-11-19-195039
Difference(s) between merge() and concat() in pandas

FAQs on differences between merge() and concat() in pandas

Explain the two functions merge and concat?

Merge is used when we want to filter rows based on common column values. Concat is used when we need to combine dataframes either horizontally or vertically.

Explain what does axis=0 and axis=1 mean in concat?

Axis=0 means adding rows or joining two dataframes vertically. Axis=1 means adding columns or joining two dataframes horizontally.

Can we perform left and right join in concat?

No in concat we can perform two types of joins: inner and outer. By default outer join is passed as value in concat.


Next Article

Similar Reads

three90RightbarBannerImg
  翻译: