To measure whether My SportApp is successful, a comprehensive evaluation must take place. This section of the report critically evaluates My SportApp’s functionality against its requirements, along with its usability, suitability and user response.
Evaluation Methodology
The following sections detail the methodology used for each type of testing and evaluation.
Testing functionality against requirements
The first stage of testing and evaluation was to test My SportApp against the initial requirements. This was completed internally by working through each user requirement and confirming that the action it specified could be completed.
For example, ‘Communicate with members via email’ was validated by confirming that an email could be sent to members from within My SportApp. The results of each test were recorded in Section 5.2.1, Table 9.
Accessibility Testing
Some accessibility testing could be covered by usability testing (observing whether any users had problems with object sizes, colours or other factors), but this alone was not conclusive. As such, third-party accessibility analysis tools or extensions were also used.
After reviewing the available accessibility checkers, Siteimprove’s Accessibility Checker was selected. This tool runs as a browser extension, meaning that it can provide real-time feedback on the specific content on the current page.
Usability Testing
Usability testing employed a variety of methods involving users from the intended user base. The test base consisted of 10 users; a larger group was originally intended, but this proved impractical for in-person testing. With 10 participants and 18 tasks, 180 individual tasks were completed, alongside the other testing and evaluation activities.
While user questioning (for example, “How easy did you find the system to use?”) was a useful evaluation method, it left too much room for subjective variance (Wickens, 1996). As a result, greater focus was placed on raw data collected through testing, such as time taken or errors made by end users of the proposed My SportApp system.
The first method was Task Completion Time (TCT) observation. TCT observations were a useful indication of how easy My SportApp is to use: the shorter the time taken, the easier the system is to use and the better its design.
It was important that the tasks that were assigned to users were representative of tasks that a real user would complete using the system; otherwise, the results would have been irrelevant. As such, a comprehensive list of tests was constructed, as shown in Appendix 3: Testing Task List.
Users were asked to complete each of these tasks in sequence, with the testing supervisor timing how long each task took from start to completion. Once complete, these results were input into a spreadsheet for analysis.
The second method was Error Rate Observation (ERO), conducted in parallel with TCT observations to save time. A higher error rate could indicate a usability issue. While timing users on each task, a tally of the number of errors was also kept. Again, once testing was complete, these tallies were entered into a spreadsheet for analysis.
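The per-task spreadsheet analysis described above can be sketched in a few lines of Python. The task names and figures below are invented for illustration and are not the study’s actual recordings:

```python
from statistics import mean

# Illustrative recordings: per task, one (seconds, errors) pair per participant.
# These values are invented for the sketch, not the study's actual data.
recordings = {
    "Task 0": [(142, 2), (131, 3), (158, 1)],
    "Task 7": [(12, 0), (15, 1), (11, 0)],
}

for task, results in recordings.items():
    times = [t for t, _ in results]    # completion times in seconds
    errors = [e for _, e in results]   # error tallies
    print(f"{task}: mean TCT {mean(times):.1f}s, mean errors {mean(errors):.2f}")
```

In practice the same aggregation would run over all 18 tasks and 10 participants, producing the per-task averages charted later in Figure 92.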
System Usability Scale
After the completion of TCT and Error Rate observations, users were asked to complete a System Usability Scale form based on the industry-recognised questionnaire. This formed the subjective part of the testing process, providing insight into the thoughts of the users after they had just used the system. This, when supplemented by the raw data, helped to gain a clearer picture as to whether the better data was also indicative of a more positive user experience.
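SUS scoring follows a fixed formula: odd-numbered (positively worded) items contribute (response − 1), even-numbered (negatively worded) items contribute (5 − response), and the sum is multiplied by 2.5 to give a 0–100 score. A minimal Python sketch, with illustrative responses rather than the study’s data:

```python
def sus_score(responses):
    """Convert ten 1-5 Likert responses into a 0-100 SUS score.

    Odd-numbered items (positively worded) contribute (response - 1);
    even-numbered items (negatively worded) contribute (5 - response);
    the total is scaled by 2.5, per the standard SUS scoring method.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    total = sum(
        (r - 1) if (i % 2 == 0) else (5 - r)  # i is 0-based, so even i = odd item
        for i, r in enumerate(responses)
    )
    return total * 2.5

# An all-positive response set yields the maximum score of 100.
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # 100.0
```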
Results
The results of the aforementioned testing can be found in the following sections.
Testing functionality against requirements
The table below shows the results of the functionality testing:
TABLE GOES HERE
Accessibility Testing
In its normal configuration, My SportApp is accessible for the majority of users and provides good contrast, with well-spaced content. However, as discussed in Section 4.2.27, an Accessibility Mode is provided for users with visual impairments. The following results are based on My SportApp running in Accessibility Mode.
Figure 87, an Accessibility Checker report for My SportApp, shows only two issues, confirming that the ‘Accessibility Mode’ within My SportApp serves its purpose of ensuring that the site is accessible to most people.

Usability Testing
The results of the usability testing can be found in the following subsections. These results are discussed in Section 5.3.
Task Completion Time


Error Rate Observations


Discussion
As shown in Table 9, all ‘Must-have’ requirements were met, with only one other requirement not being met. ‘Communicate with members via push notification’ was not implemented at this stage due to time and experience constraints. While there was familiarity with SMS and email APIs, push notifications had not previously been explored. As such, spending time learning, understanding and implementing push notifications may have led to other, more important requirements not being completed. This is supported by the initial user feedback (Section 3.10), which indicated that push notifications were the least important communication method; the other methods were therefore prioritised and achieved.
Next, for accessibility testing: while a perfect score would be ideal, it is not practical at this stage given the wide-ranging accessibility specifications. It was therefore more meaningful to compare My SportApp with similar systems, such as the SportEasy website (Figure 89). Where the My SportApp report (Figure 86) shows three deficiencies, SportEasy has ten issues. Similarly, Figure 90 shows the same accessibility report for SportMember, another sports management platform selected for accessibility testing as a direct competitor of My SportApp. Again, SportMember has a much higher issue count than My SportApp, indicating that My SportApp is significantly more compliant with accessibility specifications than its competitors.


Moving on to the quantitative testing conducted via Task Completion Time, Error Rate Observations and the System Usability Scale: Figure 92, below, charts the results of the Task Completion Time and Error Rate testing, and highlights the correlation between the two.

From the graph, we can see that tasks show high variance in both average completion time and average errors. While the task durations at first glance suggest some major deficiencies (such as Task 0 being significantly higher than the others), the variance between tasks can primarily be attributed to their varying complexity. Where some tasks, such as Task 7, could be completed from the page/view the participant already had open, others, like Task 0, required the user to enter a large amount of information across several screens. Typing speed can have a significant impact on task completion time while not indicating any deficiency in My SportApp.
On the whole, these results are a positive indication that My SportApp is well designed and offers good usability. This is evidenced by 50% of tasks being completed in 30 seconds or less, showing that users of My SportApp would be able to complete a significant number of their tasks in a very short period of time. Beyond this, only two tasks took more than 60 seconds, one of which was the sign-up process (a one-time task).
However, the average number of errors is more insightful for the system’s usability as a well-designed system should have a low error rate, regardless of the relative complexity of the task. Tasks 4, 12 and 17 stand out as having higher-than-average error rates. When paired with the testing notes, we gain greater insight into the cause of these errors:
- Task 4
- There was a lack of understanding of what custom fields were, so participants guessed what they were and then where to find/how to use them.
- Task 12
- Participants did not notice that the selected member was placed in the sidebar, so presumed an error and tried to re-enter the name again.
- They expected to be able to send messages from the member’s profile, so went there.
- They clicked the submit button more than once (despite a loading wheel being shown inside the button). This resulted in multiple form submissions, and multiple, chargeable API calls.
- Also present in task 10.
- Task 17
- Participants did not notice personalisation options, so they correctly navigated to the page (showing logical navigation) but then navigated away.
While these three tasks have higher error rates than the others, the highest average error rate across all tasks is only 2.4, further suggesting that My SportApp is well designed and easy to use, as users very infrequently make errors in commonly performed tasks. This again implies that My SportApp achieves its requirement of being simple and easy to use.
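The duplicate-submission problem noted under Tasks 10 and 12, where repeated clicks triggered multiple chargeable API calls, is commonly mitigated with a one-time idempotency token. The sketch below is a hypothetical server-side guard in Python; it is not My SportApp’s actual implementation, whose stack the report does not specify:

```python
import threading
import uuid

class IdempotencyGuard:
    """In-memory guard: each form render is issued a one-time token; a second
    submission with the same token is rejected before any chargeable API call.
    (Hypothetical sketch -- a real deployment would persist tokens with expiry.)"""

    def __init__(self):
        self._pending = set()
        self._lock = threading.Lock()

    def issue_token(self):
        # Embedded in the form as a hidden field when the page is rendered.
        token = uuid.uuid4().hex
        with self._lock:
            self._pending.add(token)
        return token

    def try_consume(self, token):
        """Return True exactly once per issued token, atomically."""
        with self._lock:
            if token in self._pending:
                self._pending.remove(token)
                return True
            return False

guard = IdempotencyGuard()
token = guard.issue_token()
assert guard.try_consume(token)      # first submission proceeds
assert not guard.try_consume(token)  # duplicate click is rejected
```

Pairing such a guard with the existing loading wheel (which alone did not deter repeated clicks) would ensure that only the first submission incurs an API charge.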
However, the composition of the participant group must be considered. Figure 92 shows the total testing duration for each participant, alongside the number of errors made. This helps to quantify any predetermined assumptions about participants.

For example, participant 2 was significantly younger than the other participants and completed the entire testing process by far the quickest, with the third-lowest error count. By contrast, participant 1 stated before testing began that they are ‘not good with technology’, which likely explains their high error count. Taking participants’ characteristics into account can help identify adaptations that could be made to suit each type of user, while still ensuring that most decisions target the majority of users (medium technical ability, middle-aged).
Taking all of this into account, it is acknowledged that while the data obtained from the testing phase provides valuable insights into the usability of My SportApp, such testing has inherent limitations. In addition to those already discussed, a key limitation of this testing, and of the ability to thoroughly evaluate My SportApp, is the lack of access to other platforms: all competing platforms are paid solutions, so the same tests cannot be performed against them without a subscription. This leaves My SportApp analysed largely in isolation; without comparative data, it is difficult to state definitively that My SportApp is ‘better’ than the competition.
Finally, the System Usability Scale data. This provides more insight into users’ reactions, which is arguably more useful in this scenario because no comparable system is needed for the feedback to be meaningful. Figure 93 overlays SUS scores onto the error/duration data already discussed. The SUS score (shown in blue) follows a similar, though not identical, trend to total errors and time taken, showing that users who completed tasks quicker and/or with fewer errors gave more positive feedback on the system. This does highlight a weakness of SUS testing: users seem to base their responses solely on their own experience, even for questions describing other user groups. This underlines the importance of combining quantitative data, such as TCT and EROs, with this qualitative data.
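The visual correlation described above can be quantified with a Pearson coefficient. The sketch below uses illustrative per-participant figures, not the study’s recorded data:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative per-participant figures (invented for the sketch):
sus_scores = [85, 90, 70, 95, 80]
total_errors = [3, 2, 9, 1, 5]
print(f"SUS vs errors: r = {pearson(sus_scores, total_errors):.2f}")
```

A strongly negative coefficient here would confirm numerically what the overlaid chart shows visually: participants who made more errors rated the system lower.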

We can see that the lowest SUS score was 67.5, with the average being 85.25. While 67.5 is only a marginal score (according to Bangor, et al. (2008)), the average falls firmly within the ‘Excellent’ rating, supporting the TCT and ERO data’s suggestion that My SportApp is, as a whole, well designed and usable.
The free-text SUS feedback did not reveal any consistent themes, suggesting that there are no key deficiencies within My SportApp, only subjective suggestions for improvement. Such feedback includes:
- Clicking on the top-level menu should bring up a page with tiles on, instead of the default action.
- When performing actions that require a force logout (such as personalisation), a popup should appear to make it obvious to users why they are being logged out.
- Renaming of certain terms.
- Navigation bar should be always visible (sticky).
- Navigation bar should not use icons for Settings/My Account, should still use full words.
- “If I got stuck, I would use the live chat button.”
While all of this feedback is useful and worth exploring further in Section 7, it is encouraging that there is no standout negative feedback, only suggestions for very minor improvements that would benefit individual users’ workflows.
Conclusions
In conclusion, the development and subsequent evaluation of My SportApp has brought forward clear insights into its usability, functionality and overall effectiveness against its requirements. Through rigorous testing methods, as previously discussed, several key findings have emerged:
Firstly, by carefully prioritising requirements against timeframe and other constraints, My SportApp was able to meet all ‘must-have’ requirements, allowing it to be a ‘fit-for-purpose’ solution for grassroots sports teams to manage their day-to-day operations. Only one requirement was not met – the implementation of push notifications – and this was a ‘could-have’ requirement.
Secondly, while comprehensive direct comparisons with other systems could not be made due to a lack of access to them, My SportApp shows a clear advantage in accessibility over competing solutions. The efficient task completion times and minimal errors seen during testing showcase My SportApp as having a robust, user-centric design.
In addition, the collection of user feedback has provided valuable insight into future enhancements and highlighted areas for improvement while maintaining a positive overall response to the system from users.
Overall, the results of the test phase confirm that My SportApp is effective as a sports management platform and fulfils its intended purpose of facilitating seamless communication and organisation in grassroots sports organisations. The results confirm the project’s success in meeting its objectives and delivering a user-friendly solution, even with the inherent limitations of the evaluation process, notably the lack of comparative data between paid platforms.
