While I generally begin to look at web analytics data at a weekly or monthly level, there are times when it is useful to drill down to daily numbers. This can be when examining the reason for a change in the data or simply to review the previous day’s performance. But an issue arises which can make it difficult to interpret and extract useful insights from this daily data.
Most metrics, when viewed at daily level, contain a form of daily seasonality. This is most clear in metrics such as visits, page views or sales which are absolute numbers. There is a re-occuring pattern throughout the week with peaks and troughs on the same day/s each week. An example of this pattern can be seen in Figure 1 below.
While this makes any chart pretty to look at, it makes it difficult to really identify trends or spikes in the data. Is a data point high because there was a spike or because it was a Monday? It is school holidays but should the number of visits on that Sat really be that low? And of course, what day did we start to see traffic decline from and how much of a change is it really?
A common method used to remove daily seasonality is to smooth the line out using a moving average. As it is a weekly pattern, a seven point moving average should lead to a nice smooth line. Unfortunately, as can be seen in Figure 2, this means you get a nice smooth line, hiding most of those interesting spikes and step changes and general data trends. You can see overall trends but you cannot pinpoint particular days when a change occurred. It is also difficult to clearly identify a change immediately, as each day only contributes one seventh to each data point.
What I advise doing instead is to remove the daily seasonality from each data point, resulting in a line that is unaffected by what day of the week it is. Using this method means that it is clear to see if the performance each day was good or bad. For example, in Figure 3, it can be seen that the relatively worst day for visits was actually the 25th Aug, even though visits for that day were higher than for other days during the reported period. The technique for removing daily seasonality can be applied each day, meaning that you can identify and react to a change in performance immediately.
The difficulty then is in calculating the daily seasonality across a week. This can be done properly using SPSS or a similar tool but I use a quick hack workaround in Excel that, while not 100% accurate, gets the job done. The steps to calculate daily seasonality for a metric (using the examples of visits) are as follows, with the example displayed in Figure 4:
- Extract historical daily visits data. You will need at least 6 weeks, more if the period includes a known number of factors that could impact on traffic e.g. school holidays, public holidays, product releases, marketing campaigns, etc.
- Reorder the data so that each column contains a single week and each row contains only data for a particular day of the week.
- Recreate this table so but replace the visits for each day with the % that visits for that day contributed to total visits for that week.
- Add two more columns to calculate the mean and median for each row of data.
- Delete all weeks which contain days which don’t reflect the general pattern. In this example, weeks 5 and 6 were deleted. At this point, the mean and the median should be relatively similar for each day of the week.
- The daily seasonality pattern is achieved by multiplying the daily mean by 7.
This daily seasonality pattern can then be used for removing daily seasonality for that metric for any day. Simply divide the value for each day by the relevant daily seasonality in order to remove it. I generally do this using a vlookup against the day of the week for each date.
Going back to the reason for web analytics, you can use this technique to clean data so that you can instantly identify good and bad days, whether this is historical data or just for the preceding day. If you are using this for historical data, you can identify the interesting days to investigate further (play with by segmenting). If you are using on an on-going basis, you can see instantly what performance was like for the previous day and if need be, investigate and react to a change accordingly.
Currently, in order to be able to do this sort of analysis, you need to extract the data into Excel. Hopefully one day, web analytics tools will allows you to upload a daily seasonality pattern for a metric so that you can display the daily data with this seasonality removed. And my dream is of a tool that would incorporate the ability to automatically create the pattern for any selected metric (with manual over rides for tweaking of course).
The other key use that I have found for a daily seasonality pattern is it can be used in forecasting daily traffic levels. If you are able to forecast what the week’s traffic should be, this can easily be multiplied out using the daily seasonality pattern to forecast traffic at a daily level.
A copy of the Excel file containing all the data, charts and formulae used in the examples above can be downloaded here – Daily Seasonality File.
This post was originally published on AussieWebAnalyst on 26th Nov ’08