dfply.window_functions
Module Contents
dfply.window_functions.lead(series, i=1)
    Returns a series shifted forward by i positions, so each row receives the value from i rows ahead. NaN values fill the vacated positions at the end.
    Same as a call to series.shift(-i).
    Args:
        series: column to shift forward.
        i (int): number of positions to shift forward.
dfply.window_functions.lag(series, i=1)
    Returns a series shifted backward by i positions, so each row receives the value from i rows behind. NaN values fill the vacated positions at the beginning.
    Same as a call to series.shift(i).
    Args:
        series: column to shift backward.
        i (int): number of positions to shift backward.
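The direction of the two shifts is easy to mix up, so here is a minimal plain-Python sketch of where the missing values end up. The helpers `lead_list` and `lag_list` are hypothetical names introduced for illustration; they work on lists, use None where pandas would put NaN, and assume a non-negative i. They are not dfply's implementation.

```python
def lead_list(values, i=1):
    """Hypothetical list analogue of lead: each slot takes the value i steps ahead."""
    k = min(i, len(values))          # never pad more than the list length
    return list(values[k:]) + [None] * k


def lag_list(values, i=1):
    """Hypothetical list analogue of lag: each slot takes the value i steps behind."""
    k = min(i, len(values))
    return [None] * k + list(values[:len(values) - k])


prices = [326, 326, 327, 334, 335]
print(lead_list(prices))    # [326, 327, 334, 335, None]  gap at the end
print(lag_list(prices))     # [None, 326, 326, 327, 334]  gap at the beginning
print(lag_list(prices, 2))  # [None, None, 326, 326, 327]
```

A typical use is a row-to-row difference: subtracting the lagged column from the original gives the change since the previous row, with a missing value in the first position.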
dfply.window_functions.between(series, a, b, inclusive=False)
    Returns a boolean series specifying whether rows of the input series are between values a and b.
    Args:
        series: column to compare, typically symbolic.
        a: value series must be greater than (or equal to if inclusive=True) for the output series to be True at that position.
        b: value series must be less than (or equal to if inclusive=True) for the output series to be True at that position.
    Kwargs:
        inclusive (bool): If True, comparison is done with >= and <=. If False (the default), comparison uses > and <.
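The effect of the inclusive flag shows up only at the boundaries. The sketch below mirrors the comparison described above on a plain list; `between_list` is a hypothetical helper for illustration, not part of dfply.

```python
def between_list(values, a, b, inclusive=False):
    """Hypothetical list analogue of between: one boolean per input value."""
    if inclusive:
        return [a <= v <= b for v in values]   # boundaries count as inside
    return [a < v < b for v in values]         # boundaries are excluded


depths = [61.5, 59.8, 56.9, 62.4, 63.3]
print(between_list(depths, 59.8, 62.4))
# [True, False, False, False, False]
print(between_list(depths, 59.8, 62.4, inclusive=True))
# [True, True, False, True, False]
```

With the default inclusive=False, the values 59.8 and 62.4 themselves are rejected, which is worth remembering when filtering on round-number limits.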
dfply.window_functions.dense_rank(series, ascending=True)
    Equivalent to series.rank(method='dense', ascending=ascending).
    Args:
        series: column to rank.
    Kwargs:
        ascending (bool): whether to rank in ascending order (default is True).
dfply.window_functions.min_rank(series, ascending=True)
    Equivalent to series.rank(method='min', ascending=ascending).
    Args:
        series: column to rank.
    Kwargs:
        ascending (bool): whether to rank in ascending order (default is True).
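dense_rank and min_rank differ only in how they number the values that follow a tie. The sketch below reproduces both schemes on a plain list, ascending only. The helper names are hypothetical, and the results are integers, whereas pandas' rank returns floats.

```python
def min_rank_list(values):
    """Ties share the lowest rank; the ranks after a tie are skipped."""
    # A value's rank is one more than the count of strictly smaller values.
    return [1 + sum(other < v for other in values) for v in values]


def dense_rank_list(values):
    """Ties share a rank; the next distinct value gets the next integer."""
    order = {v: r for r, v in enumerate(sorted(set(values)), start=1)}
    return [order[v] for v in values]


prices = [326, 326, 327, 334, 335]
print(min_rank_list(prices))    # [1, 1, 3, 4, 5]  rank 2 is skipped
print(dense_rank_list(prices))  # [1, 1, 2, 3, 4]  no gaps
```

Use the dense scheme when the rank should count distinct values, and the min scheme when it should reflect how many rows sort ahead of the current one.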
dfply.window_functions.cumsum(series)
    Calculates cumulative sum of values. Equivalent to series.cumsum().
    Args:
        series: column to compute cumulative sum for.
dfply.window_functions.cummean(series)
    Calculates cumulative mean of values. Equivalent to series.expanding().mean().
    Args:
        series: column to compute cumulative mean for.
dfply.window_functions.cummax(series)
    Calculates cumulative maximum of values. Equivalent to series.expanding().max().
    Args:
        series: column to compute cumulative maximum for.
dfply.window_functions.cummin(series)
    Calculates cumulative minimum of values. Equivalent to series.expanding().min().
    Args:
        series: column to compute cumulative minimum for.
dfply.window_functions.cumprod(series)
    Calculates cumulative product of values. Equivalent to series.cumprod().
    Args:
        series: column to compute cumulative product for.
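All five cumulative functions follow one pattern: position k of the output summarises input positions 0 through k. As a rough illustration of that pattern (using the standard library's itertools.accumulate on a plain list rather than dfply or pandas):

```python
from itertools import accumulate
import operator

values = [3, 1, 4, 1, 5]

running_sum = list(accumulate(values))                 # [3, 4, 8, 9, 14]
running_prod = list(accumulate(values, operator.mul))  # [3, 3, 12, 12, 60]
running_max = list(accumulate(values, max))            # [3, 3, 4, 4, 5]
running_min = list(accumulate(values, min))            # [3, 1, 1, 1, 1]

# Cumulative mean: running sum divided by the number of values seen so far.
running_mean = [s / n for n, s in enumerate(running_sum, start=1)]

print(running_sum, running_prod, running_max, running_min)
print(running_mean)
```

The output always has the same length as the input, which is why these functions fit naturally inside a mutate step.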
dfply.window_functions.cumany(series)
    Calculates cumulative any of values. Equivalent to series.expanding().apply(np.any).astype(bool).
    Args:
        series: column to compute cumulative any for.
dfply.window_functions.cumall(series)
    Calculates cumulative all of values. Equivalent to series.expanding().apply(np.all).astype(bool).
    Args:
        series: column to compute cumulative all for.
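Cumulative any and all behave like latches: cumany turns True at the first True and stays True, while cumall turns False at the first False and stays False. A small sketch with plain lists, using variable names made up for illustration:

```python
from itertools import accumulate

started = [False, False, True, False]
passed = [True, True, False, True]

# cumany analogue: True from the first True onward.
cum_any = list(accumulate(started, lambda seen, x: seen or x))
# cumall analogue: False from the first False onward.
cum_all = list(accumulate(passed, lambda seen, x: seen and x))

print(cum_any)  # [False, False, True, True]
print(cum_all)  # [True, True, False, False]
```

This makes the pair handy for conditions such as "every row after the first failure" or "every row up to the first failure".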
dfply.window_functions.percent_rank(series, ascending=True)
dfply.window_functions.row_number(series, ascending=True)
    Returns row number based on column rank. Equivalent to series.rank(method='first', ascending=ascending).
    Args:
        series: column to rank.
    Kwargs:
        ascending (bool): whether to rank in ascending order (default is True).
    Usage:
        diamonds >> head() >> mutate(rn=row_number(X.x))

           carat      cut color clarity  depth  table  price     x     y     z   rn
        0   0.23    Ideal     E     SI2   61.5   55.0    326  3.95  3.98  2.43  2.0
        1   0.21  Premium     E     SI1   59.8   61.0    326  3.89  3.84  2.31  1.0
        2   0.23     Good     E     VS1   56.9   65.0    327  4.05  4.07  2.31  3.0
        3   0.29  Premium     I     VS2   62.4   58.0    334  4.20  4.23  2.63  4.0
        4   0.31     Good     J     SI2   63.3   58.0    335  4.34  4.35  2.75  5.0
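The rn values in the usage output can be checked by hand. The sketch below applies the "first" tie-breaking rule (tied values are numbered in order of appearance) to the x and price columns from that output. `row_number_list` is a hypothetical helper that handles ascending order only and returns integers, whereas pandas' rank returns floats.

```python
def row_number_list(values):
    """Hypothetical list analogue of row_number: unique ranks, ties by position."""
    # Sort positions by value. Python's sort is stable, so tied values
    # keep their original relative order.
    order = sorted(range(len(values)), key=lambda pos: values[pos])
    ranks = [0] * len(values)
    for rank, pos in enumerate(order, start=1):
        ranks[pos] = rank
    return ranks


x = [3.95, 3.89, 4.05, 4.20, 4.34]
print(row_number_list(x))      # [2, 1, 3, 4, 5]  matches the rn column

price = [326, 326, 327, 334, 335]
print(row_number_list(price))  # [1, 2, 3, 4, 5]  the tied 326s get distinct ranks
```

Unlike min_rank and dense_rank, this scheme never repeats a rank, so it is a common choice for picking exactly one row per group.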