Title: | 'stats' for Seasonal Adjustment on the Fly with 'ggplot2' |
---|---|
Description: | Provides 'ggplot2' 'stats' that estimate seasonally adjusted series and rolling summaries such as rolling average on the fly for time series. |
Authors: | Peter Ellis [aut, cre], Christophe Sax [ctb] |
Maintainer: | Peter Ellis <[email protected]> |
License: | GPL-3 |
Version: | 0.5.4 |
Built: | 2024-10-31 05:36:30 UTC |
Source: | https://github.com/ellisp/ggseas |
ggseas allows you to perform seasonal decomposition on the fly as part of a 'ggplot2' pipeline.
Two main sets of functions are provided:
stat_seas(), stat_stl() and friends do seasonal adjustment, indexing and rolling averages, and by default render a line geom. You can consider these as taking the place of geom_line() in the ggplot2 pipeline.
ggsdc() goes where ggplot() normally does (i.e. at the beginning of the graphics part of the pipeline) and creates a graphic with four facets for the original data, trend, seasonal and random components.
Useful links:
Report bugs at https://github.com/ellisp/ggseas/issues
Creates a four-facet plot of seasonal decomposition showing observed, trend, seasonal and random components
ggsdc(data, mapping, frequency = NULL, method = c("stl", "decompose", "seas"), start = NULL, s.window, type = c("additive", "multiplicative"), index.ref = NULL, index.basis = 100, facet.titles = c("observed", "trend", "seasonal", "irregular"))
data |
dataset to use for plot. |
mapping |
List of aesthetic mappings. Must include x and y, and optionally can include colour/color |
frequency |
frequency of the period of the time series, e.g. 12 = monthly |
method |
function to use for performing the seasonal decomposition. stl and decompose are functions in the |
start |
starting time for the data; only needed if |
s.window |
parameter to pass to |
type |
parameter to pass to |
index.ref |
if not NULL, a vector of integers indicating which elements of the beginning of each series to use as a reference point for converting to an index. If NULL, no conversion takes place and the data are presented on the original scale. |
index.basis |
if index.ref is not NULL, the basis point for converting to an index, most commonly 100 or 1000. See examples. |
facet.titles |
a vector in the order of |
This function takes a data frame and performs seasonal decomposition on the variable mapped to the y aesthetic, grouped by the variable (if any) mapped to the colour or color aesthetic. It lets the user do the equivalent of plot(stats::decompose(x)) within ggplot2, with access to themes and other polishing; overlay several decompositions on the same graphic; and use X13-SEATS-ARIMA seasonal decomposition (so far only with default settings).
The "seasonal" component can be either multiplicative (in which case it will be in a small range of values around one) or additive (in which case it will be on the scale of the original data), depending on the settings.
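The difference between the two kinds of seasonal component can be seen with base R alone. The following is a minimal sketch that calls stats::decompose() directly rather than ggsdc(); the data frame name `components` and its column names are illustrative assumptions, chosen to mirror the four facet titles.

```r
# Base R only: ggsdc() itself is not called here.
x <- AirPassengers

add  <- decompose(x, type = "additive")
mult <- decompose(x, type = "multiplicative")

# Additive seasonal component: on the scale of the data (passenger counts)
range(add$seasonal)

# Multiplicative seasonal component: a ratio in a small range around one
range(mult$seasonal)

# The four pieces that correspond to the four facets (hypothetical layout)
components <- data.frame(
  time      = as.numeric(time(x)),
  observed  = as.numeric(x),
  trend     = as.numeric(mult$trend),     # NA at both ends of the series
  seasonal  = as.numeric(mult$seasonal),
  irregular = as.numeric(mult$random)
)
head(components)
```

The trend and irregular columns start and end with missing values because the classical method uses a centred moving average.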
an object of class ggplot with four facets
# sample time series data in data frame
ap_df <- tsdf(AirPassengers)

ggsdc(ap_df, aes(x = x, y = y), method = "decompose") +
  geom_line()

ggsdc(ap_df, aes(x = x, y = y), method = "decompose", type = "multiplicative") +
  geom_line(colour = "blue", size = 2) +
  theme_light(8)

ggsdc(ap_df, aes(x = x, y = y), method = "stl", s.window = 7) +
  labs(x = "", y = "Air passenger numbers") +
  geom_point()

## Not run:
ggsdc(ldeaths_df, aes(x = YearMon, y = deaths, colour = sex), method = "seas") +
  geom_line()

serv <- subset(nzbop, Account == "Current account" &
                 Category %in% c("Services; Exports total", "Services; Imports total"))
ggsdc(serv, aes(x = TimePeriod, y = Value, colour = Category),
      method = "seas", start = c(1971, 2), frequency = 4) +
  geom_line()
## End(Not run)

ggsdc(ldeaths_df, aes(x = YearMon, y = deaths, colour = sex), s.window = 7,
      index.ref = 1:12, index.basis = 1000) +
  geom_line() +
  ylab("Lung deaths index (average month in 1974 = 1000)")

bop <- subset(nzbop, Account == "Current account" & !Balance)
ggsdc(bop, aes(x = TimePeriod, y = Value, colour = Category),
      frequency = 4, method = "decomp", type = "multiplicative") +
  geom_line()

ggsdc(bop, aes(x = TimePeriod, y = Value, colour = Category),
      frequency = 4, s.window = 7) +
  geom_line()
A long form combination of fdeaths and mdeaths from the datasets package.
ldeaths_df
A data frame with 141 rows and 3 variables.
YearMon. Approximate, regular decimal representation of the beginning of the period of measurement. January 1974 is 1974.000
sex.
deaths. Monthly deaths from bronchitis, emphysema and asthma. ...
P. J. Diggle (1990) Time Series: A Biostatistical Introduction. Oxford, table A.3
New Zealand's "BPM6 Quarterly, Balance of payments major components (Qrtly-Mar/Jun/Sep/Dec)".
nzbop
An object of class data.frame with 3676 rows and 5 columns.
"BPM6" refers to the sixth edition of the IMF's Balance of Payments and International Investment Position Manual, which is the method used by Statistics New Zealand to prepare these data.
Note:
'Value' is in millions of New Zealand dollars and is not adjusted for inflation.
'fob' means 'free on board'.
'inv.' stands for investment
TimePeriod is the last day of the quarterly reference period, i.e. 1971-06-30 means the fourth, fifth and sixth months of 1971.
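Because TimePeriod is a quarter-end date rather than a decimal year, functions that take start and frequency arguments need the year and quarter spelled out. The sketch below shows one way to derive them in base R; the variable names are illustrative assumptions and are not part of the package.

```r
# A quarter-end date such as those in TimePeriod
period_end <- as.Date("1971-06-30")

year    <- as.integer(format(period_end, "%Y"))
month   <- as.integer(format(period_end, "%m"))
quarter <- (month - 1L) %/% 3L + 1L   # months 4-6 fall in quarter 2

# In the form expected by ts(): c(year, period within year)
start <- c(year, quarter)
start

# Decimal representation of the beginning of that quarter
decimal_start <- year + (quarter - 1) / 4
decimal_start
```

With quarterly data the matching frequency is 4.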
This dataset was downloaded from http://www.stats.govt.nz/infoshare/ and transformed in the following way:
missing values were filtered out (i.e. for series that started later than the longest series)
a 'Balance' indicator variable was added for easier manipulation and filtering
the single variable categorisation was split into two (Account and Category) to make it tidier.
Statistics New Zealand http://www.stats.govt.nz/browse_for_stats/economic_indicators/balance_of_payments/info-releases.aspx
Conducts seasonal adjustment on the fly for ggplot2, from classical seasonal decomposition by moving averages
stat_decomp(mapping = NULL, data = NULL, geom = "line", position = "identity", show.legend = NA, inherit.aes = TRUE, frequency = NULL, type = c("additive", "multiplicative"), index.ref = NULL, index.basis = 100, ...)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data |
position |
Position adjustment, either as a string, or the result of a call to a position adjustment function. |
show.legend |
logical. Should this layer be included in the legends? |
inherit.aes |
If |
frequency |
The frequency for the time series |
type |
The type of seasonal component |
index.ref |
if not NULL, a vector of integers indicating which elements of the beginning of each series to use as a reference point for converting to an index. If NULL, no conversion takes place and the data are presented on the original scale. |
index.basis |
if index.ref is not NULL, the basis point for converting to an index, most commonly 100 or 1000. See examples. |
... |
other arguments for the geom |
Classical decomposition is a very basic way of performing seasonal adjustment and is not recommended if you have access to X13-SEATS-ARIMA (stat_seas). stat_decomp cannot allow the seasonality to vary over time, or take outliers into account in calculating seasonality.
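To illustrate what classical seasonal adjustment amounts to, here is a minimal base R sketch using stats::decompose() directly. It is an assumption that this mirrors what the stat computes internally; the variable names are illustrative only.

```r
x <- AirPassengers

# Additive: subtract the seasonal component from the observed series
dec_add <- decompose(x, type = "additive")
sa_add  <- x - dec_add$seasonal

# Multiplicative: divide the observed series by the seasonal factors
dec_mult <- decompose(x, type = "multiplicative")
sa_mult  <- x / dec_mult$seasonal

# The seasonal pattern is fixed: every year repeats the same 12 values,
# which is why this method cannot follow seasonality that changes over time
all.equal(as.numeric(dec_mult$seasonal[1:12]),
          as.numeric(dec_mult$seasonal[133:144]))
```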
Other time series stats for ggplot2: stat_index, stat_rollapplyr, stat_seas, stat_stl
ap_df <- tsdf(AirPassengers)

# Default additive decomposition (doesn't work well in this case!):
ggplot(ap_df, aes(x = x, y = y)) +
  stat_decomp()

# Multiplicative decomposition, more appropriate:
ggplot(ap_df, aes(x = x, y = y)) +
  stat_decomp(type = "multiplicative")

# Multiple time series example:
ggplot(ldeaths_df, aes(x = YearMon, y = deaths, colour = sex)) +
  geom_point() +
  facet_wrap(~sex) +
  stat_decomp() +
  ggtitle("Seasonally adjusted lung deaths")

# Example using index:
ggplot(ldeaths_df, aes(x = YearMon, y = deaths, colour = sex)) +
  facet_wrap(~sex) +
  stat_decomp(index.ref = 1:12, index.basis = 1000) +
  ggtitle("Seasonally adjusted lung deaths, indexed (average month in 1974 = 1000)")
Convert a time series from the original scale to an index for ggplot2
stat_index(mapping = NULL, data = NULL, geom = "line", position = "identity", show.legend = NA, inherit.aes = TRUE, index.ref = NULL, index.basis = 100, ...)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data |
position |
Position adjustment, either as a string, or the result of a call to a position adjustment function. |
show.legend |
logical. Should this layer be included in the legends? |
inherit.aes |
If |
index.ref |
if not NULL, a vector of integers indicating which elements of the beginning of each series to use as a reference point for converting to an index. If NULL, no conversion takes place and the data are presented on the original scale. |
index.basis |
if index.ref is not NULL, the basis point for converting to an index, most commonly 100 or 1000. See examples. |
... |
other arguments for the geom |
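The index.ref and index.basis arguments can be understood through a small base R sketch. The helper `index_series()` below is hypothetical and not part of the package; it assumes the reference level is the mean of the reference elements, which is consistent with the "average month = 1000" labels used in the examples.

```r
# Hypothetical helper: rescale a series so that the mean of the
# reference elements equals the chosen basis
index_series <- function(y, index.ref = 1, index.basis = 100) {
  y / mean(y[index.ref]) * index.basis
}

deaths <- as.numeric(mdeaths)

# Average of the first 12 months becomes 1000
idx <- index_series(deaths, index.ref = 1:12, index.basis = 1000)
mean(idx[1:12])

# First observation becomes 100
idx1 <- index_series(deaths, index.ref = 1)
idx1[1]
```

Indexing makes series with different levels comparable on one axis, since each is expressed relative to its own reference period.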
Other time series stats for ggplot2: stat_decomp, stat_rollapplyr, stat_seas, stat_stl
ap_df <- tsdf(AirPassengers)

ggplot(ldeaths_df, aes(x = YearMon, y = deaths, color = sex)) +
  stat_index(index.ref = 1:12, index.basis = 1000) +
  ylab("Deaths index\n(average of first 12 months = 1000)")
Calculates a rolling summary, usually rolling average, on the fly for ggplot2
stat_rollapplyr(mapping = NULL, data = NULL, geom = "line", position = "identity", show.legend = NA, inherit.aes = TRUE, width, align = "right", FUN = mean, index.ref = NULL, index.basis = 100, ...)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data |
position |
Position adjustment, either as a string, or the result of a call to a position adjustment function. |
show.legend |
logical. Should this layer be included in the legends? |
inherit.aes |
If |
width |
The width to which the rolling version of FUN is applied |
align |
specifies whether the transformed series should be left or right-aligned or centered compared to the rolling window of observations |
FUN |
summary function, usually some kind of average, to apply on a rolling basis |
index.ref |
if not NULL, a vector of integers indicating which elements of the beginning of each series to use as a reference point for converting to an index. If NULL, no conversion takes place and the data are presented on the original scale. |
index.basis |
if index.ref is not NULL, the basis point for converting to an index, most commonly 100 or 1000. See examples. |
... |
other arguments for the geom |
Calculates a rolling summary (usually rolling average) on the fly for purposes of plotting with ggplot2.
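As a rough illustration of a right-aligned rolling summary, the sketch below uses base R only. The helper `roll_apply_right()` is hypothetical and is not the package's implementation; it simply applies FUN to each window that ends at the current observation.

```r
y <- as.numeric(AirPassengers)
width <- 12

# Right-aligned rolling mean with a one-sided moving-average filter;
# the first width - 1 values are NA because the window is incomplete
roll_mean <- as.numeric(stats::filter(y, rep(1 / width, width), sides = 1))

# Hypothetical general version for any summary function
roll_apply_right <- function(y, width, FUN = mean) {
  out <- rep(NA_real_, length(y))
  for (i in seq_along(y)) {
    if (i >= width) out[i] <- FUN(y[(i - width + 1):i])
  }
  out
}

roll_median <- roll_apply_right(y, width, FUN = median)
head(cbind(y, roll_mean, roll_median), 14)
```

With a width equal to the seasonal period (12 for monthly data), each window contains one of every month, so the rolling average smooths out a stable seasonal pattern.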
Other time series stats for ggplot2: stat_decomp, stat_index, stat_seas, stat_stl
ap_df <- tsdf(AirPassengers)

ggplot(ap_df, aes(x = x, y = y)) +
  stat_rollapplyr(width = 12)

# rolling average after converting to an index, 1000 = average value
# in the first 12 months.
ggplot(ap_df, aes(x = x, y = y)) +
  stat_rollapplyr(width = 12, index.ref = 1:12, index.basis = 1000)

ggplot(ldeaths_df, aes(x = YearMon, y = deaths, colour = sex)) +
  geom_point() +
  facet_wrap(~sex) +
  stat_rollapplyr(width = 12, FUN = median) +
  ggtitle("Rolling annual median lung deaths")
Conducts X13-SEATS-ARIMA seasonal adjustment on the fly for ggplot2
stat_seas(mapping = NULL, data = NULL, geom = "line", position = "identity", show.legend = NA, inherit.aes = TRUE, x13_params = NULL, index.ref = NULL, index.basis = 100, frequency = NULL, start = NULL, ...)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data |
position |
Position adjustment, either as a string, or the result of a call to a position adjustment function. |
show.legend |
logical. Should this layer be included in the legends? |
inherit.aes |
If |
x13_params |
a list of other parameters for |
index.ref |
if not NULL, a vector of integers indicating which elements of the beginning of each series to use as a reference point for converting to an index. If NULL, no conversion takes place and the data are presented on the original scale. |
index.basis |
if index.ref is not NULL, the basis point for converting to an index, most commonly 100 or 1000. See examples. |
frequency |
The frequency for the time series |
start |
The starting point for the time series, in a format suitable for |
... |
other arguments for the geom |
Other time series stats for ggplot2: stat_decomp, stat_index, stat_rollapplyr, stat_stl
## Not run:
ap_df <- tsdf(AirPassengers)

# SEATS with defaults:
ggplot(ap_df, aes(x = x, y = y)) +
  stat_seas()

# X11 with no outlier treatment:
ggplot(ap_df, aes(x = x, y = y)) +
  stat_seas(x13_params = list(x11 = "", outlier = NULL))

# Multiple time series example:
ggplot(ldeaths_df, aes(x = YearMon, y = deaths, colour = sex)) +
  geom_point() +
  facet_wrap(~sex) +
  stat_seas() +
  ggtitle("Seasonally adjusted lung deaths")

# example use of index:
ggplot(ap_df, aes(x = x, y = y)) +
  stat_seas(x13_params = list(x11 = "", outlier = NULL),
            index.ref = 1, index.basis = 1000) +
  labs(y = "Seasonally adjusted index\n(first observation = 1000)")

# if the x value is not a decimal, e.g. not created with time(your_ts_object),
# you need to specify start and frequency by hand
# (nzbop is quarterly, so frequency = 4):
ggplot(subset(nzbop, Account == "Current account"),
       aes(x = TimePeriod, y = Value)) +
  stat_seas(start = c(1971, 2), frequency = 4) +
  facet_wrap(~Category, scales = "free_y")
## End(Not run)
Conducts seasonal adjustment on the fly for ggplot2, from LOESS seasonal decomposition
stat_stl(mapping = NULL, data = NULL, geom = "line", position = "identity", show.legend = NA, inherit.aes = TRUE, frequency = NULL, s.window, index.ref = NULL, index.basis = 100, ...)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data |
position |
Position adjustment, either as a string, or the result of a call to a position adjustment function. |
show.legend |
logical. Should this layer be included in the legends? |
inherit.aes |
If |
frequency |
The frequency for the time series |
s.window |
either the character string |
index.ref |
if not NULL, a vector of integers indicating which elements of the beginning of each series to use as a reference point for converting to an index. If NULL, no conversion takes place and the data are presented on the original scale. |
index.basis |
if index.ref is not NULL, the basis point for converting to an index, most commonly 100 or 1000. See examples. |
... |
other arguments for the geom |
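The effect of s.window on the seasonal component can be sketched with stats::stl() directly. This is a base R illustration under the assumption that the seasonally adjusted series is the observed series minus the STL seasonal component; the variable names are illustrative.

```r
x <- AirPassengers

# Fixed seasonality: the same seasonal pattern is used for every year
fit_periodic <- stl(x, s.window = "periodic")
# Seasonality allowed to evolve, smoothed over 7 years of each month
fit_7 <- stl(x, s.window = 7)

seas_periodic <- fit_periodic$time.series[, "seasonal"]
seas_7        <- fit_7$time.series[, "seasonal"]

# Seasonally adjusted series: observed minus seasonal
sa_periodic <- x - seas_periodic
sa_7        <- x - seas_7

# With "periodic" the first and last years share one pattern;
# with s.window = 7 they differ
all.equal(as.numeric(seas_periodic[1:12]), as.numeric(seas_periodic[133:144]))
isTRUE(all.equal(as.numeric(seas_7[1:12]), as.numeric(seas_7[133:144])))
```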
Other time series stats for ggplot2: stat_decomp, stat_index, stat_rollapplyr, stat_seas
ap_df <- tsdf(AirPassengers)

# periodic if fixed seasonality; doesn't work well:
ggplot(ap_df, aes(x = x, y = y)) +
  stat_stl(s.window = "periodic")

# seasonality varies a bit over time, works better:
ggplot(ap_df, aes(x = x, y = y)) +
  stat_stl(s.window = 7)

# Multiple time series example:
ggplot(ldeaths_df, aes(x = YearMon, y = deaths, colour = sex)) +
  geom_point() +
  facet_wrap(~sex) +
  stat_stl(s.window = 7) +
  ggtitle("Seasonally adjusted lung deaths")

# Index so first value is 100:
ggplot(ap_df, aes(x = x, y = y)) +
  stat_stl(s.window = 7, index.ref = 1)
Convert a ts object to data.frame with columns for time period and the original data
tsdf(timeseries, colname = "x")
timeseries |
an object of class ts or mts |
colname |
Column name to give to the time period column |
A convenience function to create a data frame from a time series or multiple time series object. The motivation is to make it easy to pass time series data to functions that need data frames such as ggplot2.
a data.frame with the same number of rows as the original time series
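For readers who want to see what such a conversion involves, here is a base R sketch that builds comparable data frames by hand. It is not the package's implementation; the column names x and y are assumed from the aes(x = x, y = y) mappings used in the examples in this document.

```r
# Univariate series: decimal time in x, values in y
ap_manual <- data.frame(
  x = as.numeric(time(AirPassengers)),
  y = as.numeric(AirPassengers)
)
head(ap_manual)

# Multiple time series: one column per series alongside the time column
ld <- cbind(fdeaths, mdeaths)
ld_manual <- data.frame(
  x       = as.numeric(time(ld)),
  fdeaths = as.numeric(ld[, "fdeaths"]),
  mdeaths = as.numeric(ld[, "mdeaths"])
)
head(ld_manual)
```

Each data frame has one row per observation of the original series.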
head(tsdf(AirPassengers))

ld <- cbind(fdeaths, mdeaths)
head(tsdf(ld))