dfply.join

Module Contents

dfply.join.get_join_parameters(join_kwargs)

Convenience function to determine the columns to join the right and left DataFrames on, as well as any suffixes for the columns.

dfply.join.inner_join(df, other, **kwargs)

Joins on values present in both DataFrames.

Args:
    df (pandas.DataFrame): Left DataFrame (passed in via pipe).
    other (pandas.DataFrame): Right DataFrame.

Kwargs:
    by (str or list): Columns to join on. If a single string, will join
        on that column. If a list of lists which contain strings or
        integers, the right/left columns to join on.
    suffixes (list): String suffixes to append to column names in left
        and right DataFrames.

Example:
    a >> inner_join(b, by='x1')

      x1  x2     x3
    0  A   1   True
    1  B   2  False
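
The same result can be reproduced in plain pandas, which may help when checking what the join keeps. This is a minimal sketch, not dfply's implementation: the frames `a` and `b` are not defined in this section and are reconstructed here from the example output, and `DataFrame.merge` is used as the presumed pandas counterpart.

```python
import pandas as pd

# Assumed inputs, reconstructed from the example output (not given in the source).
a = pd.DataFrame({"x1": ["A", "B", "C"], "x2": [1, 2, 3]})
b = pd.DataFrame({"x1": ["A", "B", "D"], "x3": [True, False, True]})

# Inner join on x1: only keys present in both frames survive (A and B).
inner = a.merge(b, how="inner", on="x1")
print(inner)
```

Keys "C" (left only) and "D" (right only) are dropped, which is why the result has two rows.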

dfply.join.full_join(df, other, **kwargs)

Joins on values present in either DataFrame. (Alternate to outer_join)

Args:
    df (pandas.DataFrame): Left DataFrame (passed in via pipe).
    other (pandas.DataFrame): Right DataFrame.

Kwargs:
    by (str or list): Columns to join on. If a single string, will join
        on that column. If a list of lists which contain strings or
        integers, the right/left columns to join on.
    suffixes (list): String suffixes to append to column names in left
        and right DataFrames.

Example:
    a >> full_join(b, by='x1')

      x1   x2     x3
    0  A  1.0   True
    1  B  2.0  False
    2  C  3.0    NaN
    3  D  NaN   True

dfply.join.outer_join(df, other, **kwargs)

Joins on values present in either DataFrame. (Alternate to full_join)

Args:
    df (pandas.DataFrame): Left DataFrame (passed in via pipe).
    other (pandas.DataFrame): Right DataFrame.

Kwargs:
    by (str or list): Columns to join on. If a single string, will join
        on that column. If a list of lists which contain strings or
        integers, the right/left columns to join on.
    suffixes (list): String suffixes to append to column names in left
        and right DataFrames.

Example:
    a >> outer_join(b, by='x1')

      x1   x2     x3
    0  A  1.0   True
    1  B  2.0  False
    2  C  3.0    NaN
    3  D  NaN   True
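
The example output shows `x2` as `1.0`, `2.0`, `3.0` rather than integers. The plain-pandas sketch below illustrates why: unmatched keys introduce `NaN`, and an integer column that gains a `NaN` is upcast to float. The frames `a` and `b` are assumptions reconstructed from the example output, and `DataFrame.merge` stands in for the dfply call.

```python
import pandas as pd

# Assumed inputs, reconstructed from the example output (not given in the source).
a = pd.DataFrame({"x1": ["A", "B", "C"], "x2": [1, 2, 3]})
b = pd.DataFrame({"x1": ["A", "B", "D"], "x3": [True, False, True]})

# Outer (full) join on x1: keys from either frame are kept.
outer = a.merge(b, how="outer", on="x1")
print(outer)

# x2 started as an integer column but now holds a NaN for key "D",
# so pandas stores it as a float column.
print(outer["x2"].dtype)
```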

dfply.join.left_join(df, other, **kwargs)

Joins on values present in the left DataFrame.

Args:
    df (pandas.DataFrame): Left DataFrame (passed in via pipe).
    other (pandas.DataFrame): Right DataFrame.

Kwargs:
    by (str or list): Columns to join on. If a single string, will join
        on that column. If a list of lists which contain strings or
        integers, the right/left columns to join on.
    suffixes (list): String suffixes to append to column names in left
        and right DataFrames.

Example:
    a >> left_join(b, by='x1')

      x1  x2     x3
    0  A   1   True
    1  B   2  False
    2  C   3    NaN
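
The `suffixes` argument only matters when both frames share a non-key column name. The sketch below shows the effect using plain pandas `DataFrame.merge`; the frames and the column name `score` are hypothetical, and whether `left_join` forwards `suffixes` to pandas unchanged is an assumption, not something this page states.

```python
import pandas as pd

# Hypothetical frames that share the non-key column "score".
left = pd.DataFrame({"x1": ["A", "B", "C"], "score": [1, 2, 3]})
right = pd.DataFrame({"x1": ["A", "B", "D"], "score": [10, 20, 40]})

# Left join on x1: every left row is kept; the clashing "score" columns
# are disambiguated with the given suffixes.
joined = left.merge(right, how="left", on="x1", suffixes=("_left", "_right"))
print(joined)
```

Key "C" has no match on the right, so its `score_right` value is missing.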

dfply.join.right_join(df, other, **kwargs)

Joins on values present in the right DataFrame.

Args:
    df (pandas.DataFrame): Left DataFrame (passed in via pipe).
    other (pandas.DataFrame): Right DataFrame.

Kwargs:
    by (str or list): Columns to join on. If a single string, will join
        on that column. If a list of lists which contain strings or
        integers, the right/left columns to join on.
    suffixes (list): String suffixes to append to column names in left
        and right DataFrames.

Example:
    a >> right_join(b, by='x1')

      x1   x2     x3
    0  A  1.0   True
    1  B  2.0  False
    2  D  NaN   True
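
When the key columns have different names in the two frames, plain pandas expresses the pairing with `left_on` and `right_on`. The sketch below shows that case for a right join; the frame `c` and its column `key` are hypothetical, and how dfply's `by` argument maps onto these pandas parameters is not shown in this section.

```python
import pandas as pd

# "a" is reconstructed from the example output; "c" and its "key" column are hypothetical.
a = pd.DataFrame({"x1": ["A", "B", "C"], "x2": [1, 2, 3]})
c = pd.DataFrame({"key": ["A", "B", "D"], "x3": [True, False, True]})

# Right join pairing a.x1 with c.key: every row of the right frame is kept.
right = a.merge(c, how="right", left_on="x1", right_on="key")
print(right)
```

Both key columns appear in the result because their names differ; the unmatched key "D" has missing values in the left-hand columns.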

dfply.join.semi_join(df, other, **kwargs)

Returns all of the rows in the left DataFrame that have a match in the right DataFrame.

Args:
    df (pandas.DataFrame): Left DataFrame (passed in via pipe).
    other (pandas.DataFrame): Right DataFrame.

Kwargs:
    by (str or list): Columns to join on. If a single string, will join
        on that column. If a list of lists which contain strings or
        integers, the right/left columns to join on.

Example:
    a >> semi_join(b, by='x1')

      x1  x2
    0  A   1
    1  B   2
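
A semi join filters the left frame without adding any columns from the right. For a single key column, one way to sketch this in plain pandas is a membership test with `Series.isin`; this is an illustration of the semantics, not dfply's implementation, and `a` and `b` are assumptions reconstructed from the example output.

```python
import pandas as pd

# Assumed inputs, reconstructed from the example output (not given in the source).
a = pd.DataFrame({"x1": ["A", "B", "C"], "x2": [1, 2, 3]})
b = pd.DataFrame({"x1": ["A", "B", "D"], "x3": [True, False, True]})

# Keep the left rows whose key also appears in the right frame.
# No columns from b are carried over.
semi = a[a["x1"].isin(b["x1"])]
print(semi)
```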

dfply.join.anti_join(df, other, **kwargs)

Returns all of the rows in the left DataFrame that do not have a match in the right DataFrame.

Args:
    df (pandas.DataFrame): Left DataFrame (passed in via pipe).
    other (pandas.DataFrame): Right DataFrame.

Kwargs:
    by (str or list): Columns to join on. If a single string, will join
        on that column. If a list of lists which contain strings or
        integers, the right/left columns to join on.

Example:
    a >> anti_join(b, by='x1')

      x1  x2
    2  C   3
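
An anti join can be sketched in plain pandas with a left merge and the `indicator=True` option, which adds a `_merge` column recording where each row found its key. This is one possible formulation under the assumption that `a` and `b` look as reconstructed from the example output; it is not presented as dfply's own code.

```python
import pandas as pd

# Assumed inputs, reconstructed from the example output (not given in the source).
a = pd.DataFrame({"x1": ["A", "B", "C"], "x2": [1, 2, 3]})
b = pd.DataFrame({"x1": ["A", "B", "D"], "x3": [True, False, True]})

# Left-merge against the distinct right-hand keys and record the match status.
marked = a.merge(b[["x1"]].drop_duplicates(), how="left", on="x1", indicator=True)

# Rows marked "left_only" have no partner in b; keep only a's columns.
anti = marked.loc[marked["_merge"] == "left_only", ["x1", "x2"]]
print(anti)
```

Dropping duplicate keys on the right first keeps the merge from multiplying left rows.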

dfply.join.bind_rows(df, other, join='outer', ignore_index=False)

Binds DataFrames “vertically”, stacking them together. This is equivalent to pd.concat with axis=0.

Args:
    df (pandas.DataFrame): Top DataFrame (passed in via pipe).
    other (pandas.DataFrame): Bottom DataFrame.

Kwargs:
    join (str): One of "outer" or "inner". Outer join will preserve
        columns not present in both DataFrames, whereas inner joining
        will drop them.
    ignore_index (bool): Indicates whether to consider pandas indices as
        part of the concatenation (defaults to False).

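
Since the description equates this with `pd.concat` and `axis=0`, the effect of `join` and `ignore_index` can be sketched directly in pandas. The frames `top` and `bottom` are hypothetical illustration data.

```python
import pandas as pd

# Hypothetical frames: they share x1, but x2 and x3 each exist in only one.
top = pd.DataFrame({"x1": ["A", "B"], "x2": [1, 2]})
bottom = pd.DataFrame({"x1": ["C"], "x3": [True]})

# join="outer" keeps every column; the original row labels are retained,
# so the label 0 appears twice.
stacked_outer = pd.concat([top, bottom], axis=0, join="outer", ignore_index=False)

# join="inner" keeps only the shared column; ignore_index=True relabels rows 0..n-1.
stacked_inner = pd.concat([top, bottom], axis=0, join="inner", ignore_index=True)

print(stacked_outer)
print(stacked_inner)
```
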
dfply.join.bind_cols(df, other, join='outer', ignore_index=False)

Binds DataFrames “horizontally”. This is equivalent to pd.concat with axis=1.

Args:
    df (pandas.DataFrame): Left DataFrame (passed in via pipe).
    other (pandas.DataFrame): Right DataFrame.

Kwargs:
    join (str): One of "outer" or "inner". Outer join will preserve
        rows not present in both DataFrames, whereas inner joining
        will drop them.
    ignore_index (bool): Indicates whether to consider pandas indices as
        part of the concatenation (defaults to False).
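
Because the description equates this with `pd.concat` and `axis=1`, rows are aligned by index label rather than by position. The sketch below uses plain pandas and two hypothetical frames whose indices only partly overlap to show what `join` changes.

```python
import pandas as pd

# Hypothetical frames with partly overlapping row labels.
left = pd.DataFrame({"x1": ["A", "B", "C"]}, index=[0, 1, 2])
right = pd.DataFrame({"x3": [True, False]}, index=[1, 2])

# join="outer" keeps every row label; row 0 has no x3 value.
side_outer = pd.concat([left, right], axis=1, join="outer")

# join="inner" keeps only the labels present in both frames (1 and 2).
side_inner = pd.concat([left, right], axis=1, join="inner")

print(side_outer)
print(side_inner)
```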