Our introductory survey of FEO best practice continues by outlining a standardised approach to monitoring and analysis.
Other titles in this blog series are:
- FEO – reports of its death have been much exaggerated – 22 February 2016 [also published in APM Digest]
- Introduction to FEO – Tooling Part 1 – 1 March 2016
- Introduction to FEO – Tooling Part 2 – 8 March 2016
- Introduction to FEO – Operations – Process – 15 March 2016
- Introduction to FEO – Granular Analysis Part 1 – 22 March 2016
- Introduction to FEO – Granular Analysis Part 2 – 29 March 2016
- Introduction to FEO – Granular Analysis Part 3 – 5 April 2016
- Introduction to FEO – Edge Case management – 12 April 2016
- Introduction to FEO – Summary, bibliography for further study – 12 April 2016
Process – Evidence from Monitoring:
Having considered the types of tools available to support external monitoring, this post continues the ‘FEO toe dip’ by outlining a structured process to support understanding and intervention in this important area.
At a high level, a logical Front End Optimisation [FEO] process seeks to move progressively from general to specific understanding. The key stages are:
- Target definition, test specification
- External performance (response) and patterns
- ‘Performance monetisation goal’ – what is the optimal performance/investment level that will just meet business goals?
- Distribution of time between front end processing, backend, and third party components – how much FEO is going to be required?
- Detailed, granular client side analysis [covered in a future post]
- KPI definition; ongoing monitoring (best case external + end user)
A detailed report should consider many aspects of performance: both outturn performance (that is, the recorded response of the site or application under known test conditions) and the underlying performance of relevant contributory factors. The table below is an extract from an analysis report on a major corporate site, illustrating some of the factors considered.
RAG summary – FEO report
Before embarking on any actual analysis, it is worth pausing to define the targets for the testing. Such target definition is likely to collate information from multiple sources, including:
- Knowledge of the key user touchpoints, for example landing pages, product category pages, shopping basket. In thinking about this, a useful guide is to “follow the money”, in other words to track key revenue generating paths/activities.
- Information derived from web analytics. This useful source will identify key transaction flows (unusual deviations from ‘theoretical’ expectations may reflect design or performance issues). Areas of the site associated with unexplained negative behaviours should be included. Examples: transaction steps associated with high abandonment, or landing pages combining high destination traffic (eg Search Engine derived) with high bounce rate.
If available, user click pattern ‘heat maps’ can also be a useful supplement.
Visitor interaction – click pattern ‘heatmap’ – standalone example [www.crazyegg.com]
- ‘Folk knowledge’ – internal users, ‘friends and family’, customer services, the CEO’s golfing partners…
- Other visitor-based analysis (eg Real User Monitoring), in particular: key markets, devices, operating systems, screen resolutions, and connectivity distributions. The latter is particularly useful if supported by your APM tool.
One note of caution: ‘raw’ visitor-derived data (ie derived from the field, not the usability lab) is (obviously) the outcome of actual experience rather than objective, controlled test conditions. For example, a low proportion of low-specification mobile devices in the user stats may simply reflect user demographics, or it may reflect user satisfaction issues. This is where validation of RUM inferences using synthetic testing is particularly useful. ‘Why might this not be true?’ is a useful mindset for interpretation.
- Marketing/Line of Business input – who are the key competitors (by market), can anything be learnt from digital revenue data?
This will lead to a definition of the test parameters. Although testing more core and edge case conditions improves the overall understanding, in practice coverage will, as with everything, be limited by time and money.
Example test matrix:
| Test parameter | Example values |
|---|---|
| PC browser(s), versions | Edge, Chrome 48, IE 10/11, FF 34/44 |
| Mobile device (web) | Samsung Galaxy 5 and 6, iPhone 6… |
| Mobile device, O/S (mobile apps) | Samsung Galaxy 6, Android |
| Mobile app details, source | Xyz.apk, Google Play |
| Connection bandwidth range (hardwired & wireless) | 0.25–2.0 Mbps |
| Target ISPs and wireless carriers (by market) | [UK, hardwired] BT, Telstra; [UK, wireless] EE, Vodafone |
| Key geography(ies) | UK; S Spain (Malaga); Hong Kong |
| Competitor sites (/target details), by market | |
| Other specific factors (eg user details associated with complaints) | |
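Once agreed, a test matrix like the one above can usefully be encoded as a machine-readable configuration so that synthetic test runs are generated consistently rather than assembled by hand. The sketch below is purely illustrative (the object shape and the `expandRuns` helper are assumptions, not a feature of any particular monitoring product), mirroring a subset of the example matrix:

```javascript
// Hypothetical encoding of part of the example test matrix as a config
// object, so synthetic test runs can be generated programmatically.
// All field names and values are illustrative.
const testMatrix = {
  desktopBrowsers: ["Edge", "Chrome 48", "IE 10", "IE 11", "FF 34", "FF 44"],
  mobileDevices: ["Samsung Galaxy 5", "Samsung Galaxy 6", "iPhone 6"],
  bandwidthMbps: { min: 0.25, max: 2.0 },
  geographies: ["UK", "S Spain (Malaga)", "Hong Kong"],
};

// Expand to one synthetic test definition per browser/geography pair.
function expandRuns(matrix) {
  const runs = [];
  for (const browser of matrix.desktopBrowsers) {
    for (const geo of matrix.geographies) {
      runs.push({ browser, geo, bandwidth: matrix.bandwidthMbps });
    }
  }
  return runs;
}

console.log(expandRuns(testMatrix).length); // 6 browsers x 3 geographies = 18
```

Expanding the matrix in code makes the cost of each added dimension explicit – a useful discipline given that time and money limit how many conditions can be covered.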
Armed with the target specifications for testing, it is useful to begin by monitoring the outturn performance of the site/application. Such monitoring should be representative of a broad range of user conditions, and should identify patterns of response behaviour across a relatively extended period (perhaps two weeks for ‘business as usual’ data, together with peak events as a comparison). Bear in mind that the details of the client-side test conditions are likely to affect the data markedly. Some examples:
3rd Party-associated cross browser variation – PC example
Mobile response testing – limiting bandwidth connection speed vs page response
Payload variation – Samsung smartphone, wireless carrier, 1–1.5 Mbps
Variation in page size (example above) should always be investigated. Causes may be internal (eg differential compression settings across origin servers) or external (eg differences in handling by a CDN provider or wireless carrier). Appropriately designed follow-up testing will isolate the primary cause, provided that the tooling used offers appropriate flexibility.
Such preliminary monitoring enables us to understand what we are ‘up against’ from a Front End Optimisation perspective, and ultimately whether we are looking at fine tuning or wholesale interventions. Use of APM tooling can be particularly useful at this initial stage, both in understanding the relative proportion of delivery time associated with client side vs back end processing (example below), and in isolating/excluding any issues associated with delivery infrastructure or third party web services calls. However, as the external monitoring extensions to APM tools are still evolving in functionality (particularly in relation to synthetic testing), additional tools will probably be preferred for FEO monitoring, due to the better control of test conditions and/or granular analysis offered by more mature products.
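The client side vs back end split can also be approximated without APM tooling, using the standard W3C Navigation Timing fields. A common heuristic is that time to first byte approximates back-end (network plus server) time, and the remainder up to onload approximates front-end (download, parse, render) time. A minimal sketch, with illustrative timestamps:

```javascript
// Rough split of page load time into back end vs front end, using
// Navigation Timing-style fields (names per the Level 1 API).
// Heuristic only: TTFB ~ back end; the rest up to onload ~ front end.
function frontBackSplit(t) {
  const total = t.loadEventEnd - t.navigationStart;
  const backEnd = t.responseStart - t.navigationStart; // time to first byte
  const frontEnd = total - backEnd;
  return {
    totalMs: total,
    backEndMs: backEnd,
    frontEndMs: frontEnd,
    frontEndShare: frontEnd / total,
  };
}

// Example with illustrative timestamps (ms, relative to navigation start):
const sample = { navigationStart: 0, responseStart: 400, loadEventEnd: 2000 };
console.log(frontBackSplit(sample));
// { totalMs: 2000, backEndMs: 400, frontEndMs: 1600, frontEndShare: 0.8 }
```

A front-end share of 80% or more, as in this illustrative case, would point towards FEO effort rather than back-end tuning.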
Client Side vs Server Side processing time [dynaTrace (AJAX Edition)]
In capturing this baseline data, it is important to compare both consistent and end user (inherently variable) conditions. Ideally, both visitor-based [RUM] and synthetic data should be used. This will give useful information regarding the performance of all components under all traffic conditions. As mentioned in Blog 3, if it is possible to introduce common ‘above the fold’ (perceived render time) endpoints as custom markers in both types of test, that will assist in reading across between them. Such modifications provide a more realistic understanding of actual end user response, although they would be somewhat cumbersome to implement across a wide range of screen resolutions.
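One standards-based way to implement such a custom marker is the User Timing API, which both RUM and many synthetic tools can surface, giving a common perceived-render endpoint across test types. A minimal sketch (the mark name and the hero-image wiring are illustrative assumptions, not a prescribed convention):

```javascript
// Sketch of a custom 'above the fold' marker using the standard
// User Timing API. Fire the mark when the last above-the-fold element
// (eg the hero image) has rendered; tools that report User Timing
// marks can then use it as a common perceived-render endpoint.
function markAboveTheFold() {
  // Records a named timestamp on the page's performance timeline.
  performance.mark("atf-rendered");
}

// In a browser this might be wired to the hero image, eg:
// document.querySelector("#hero-img")
//   .addEventListener("load", markAboveTheFold);
```

Because the mark is just a named point on the performance timeline, the same name can be read back from RUM beacons and synthetic waterfalls alike.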
Sub page level performance. Depending upon the detailed characteristics of the target sites, it is often useful to run several comparative monitor tests. Some specific cases (eg Single Page Applications, server push content) will be covered in a later blog. It is often useful to understand the impact of particular components on overall response. This can be achieved in a number of ways, but two of the most straightforward are to test for a SPOF (single point of failure) – ie the effect of failure of particular (often third party) site content – or to remove content altogether. Techniques for achieving this will depend on the particular tool being used. See Viscomi et al’s Using WebPageTest (O’Reilly, 2016) for details in relation to that particular tool. The same intervention can be made in most synthetic tools (with more or less elegance) using the relevant scripting language/utility.
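Whatever the tool, the heart of both SPOF and removal testing is a hostname blocklist applied to outgoing requests (WebPageTest, for example, does this via DNS redirection or request blocking). The matching logic itself is simple; a minimal sketch, with illustrative hostnames:

```javascript
// Minimal sketch of the request-matching logic behind SPOF / content
// removal testing: given a blocklist of third-party hostnames, decide
// whether each request should be blocked or blackholed.
// Hostnames below are illustrative examples only.
const blockedHosts = ["ads.example-tag.com", "widgets.example-social.net"];

function shouldBlock(url, blocklist) {
  const host = new URL(url).hostname;
  // Match the host itself or any of its subdomains.
  return blocklist.some((b) => host === b || host.endsWith("." + b));
}

console.log(shouldBlock("https://ads.example-tag.com/tag.js", blockedHosts)); // true
console.log(shouldBlock("https://www.example.com/app.js", blockedHosts)); // false
```

For a SPOF test the blocked host should hang (eg resolve to a blackhole address) rather than fail fast, since an immediate error can mask the real-world impact of a slow third party.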
‘Above the fold’ end point – custom insertion of flag image – synthetic testing
Use of different test end points can have significant effects on reported results and their interpretation. The table below illustrates the variation for a single target page within a major UK eCommerce site:
Selective filtering of content can also be used to examine the effect of particular calls on aggregate delivery metrics such as DNS resolution or content delivery times.
The following is a standard monitoring matrix that we typically use for preliminary screening of external performance. The results are used to inform and direct detailed granular analysis, an overview of which will be covered in the next blog in this series.
FEO preliminary screening process – example:
- ‘Dynamic’ performance – page onload and perceived render [‘Above the Fold’]
  - 24×7 availability and response patterns – synthetic ISP (by market)
  - 24×7 availability & response – end user by market (synthetic & RUM)
  - Target browser/screen resolution & device
  - Any cross browser/device discrepancies?
- Defined connectivity – hardwired & public wireless carrier
  - Target browser/screen resolution & device
- Page response distribution
  - Histogram of response ranges
  - Median response & distribution (Median Absolute Deviation)
  - Weekly business hours
  - Day vs night (variation with traffic)
  - Cached vs uncached
  - By key market / user category
- Performance monetisation (tool dependent, examples):
  - Page/transaction response vs shopping cart conversion
  - Page/transaction response vs abandonment
  - Page/transaction response vs mean basket size
  - Page/transaction response vs digital revenue (defined time period)
  - Page response vs bounce or exit rate
- Competitive comparison – direct and mass market sites
  - Page and ‘revenue bearing transaction’ (eg search & Add to Basket)
- Limiting bandwidth tests
  - Response to progressively reducing connectivity conditions
  - Wi-Fi & public carrier
- Transaction step comparison
  - Where are the slowest steps (& why – eg database lookup)?
- ‘Payload’ analysis
  - Page download size patterns
  - ‘Affiliate load’ – 3rd party effects
- Filter & SPOF testing
- ‘Real device’ mobile testing
- Component splits/patterns
  - DNS/SSL resolution
  - First byte delivery time (infrastructure latency)
  - Content delivery
- CDN performance assurance
  - Origin vs local cache comparison
- Detailed ‘static’ component analysis
  - Response to progressively reducing connectivity conditions
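The robust summary statistics in the matrix above – the median and the Median Absolute Dispersion (more commonly called the Median Absolute Deviation, or MAD) – are preferred over the mean and standard deviation because a handful of outlier responses would otherwise dominate. A minimal sketch of both calculations:

```javascript
// Median and Median Absolute Deviation (MAD) of a set of response
// times: robust statistics largely insensitive to outliers.
function median(xs) {
  const s = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

function medianAbsoluteDeviation(xs) {
  const m = median(xs);
  // Median of the absolute deviations from the median.
  return median(xs.map((x) => Math.abs(x - m)));
}

// Example: page response times in seconds, with one outlier
const responses = [1.8, 2.0, 2.1, 2.3, 9.5];
console.log(median(responses)); // 2.1
console.log(medianAbsoluteDeviation(responses)); // ~0.2
```

Note that the 9.5 s outlier barely moves either statistic, whereas it would drag the mean up to roughly 3.5 s – exactly why these measures suit noisy field response data.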
The above 13-point checklist will provide a picture of the revenue-relevant behaviour of the site/application. This is not exhaustive – it is necessary to be led by the findings from case to case – but it supports targeting for the more granular ‘static’ component-level analysis which provides the root cause / business justification basis for specific remediation interventions. Some approaches to detailed, granular analysis are covered in the next post of this series.