Description
We have been discussing setting up analytics on the website for a while (forum: Add Plausible Tracking to DjangoProject.com?). At this point I think there is a pretty clear case for it and it’s more of a matter of deciding what tool(s) we could use and setting them up.
So I think opening this as a GitHub issue will help make this more of a task that we want to happen, not just an idea.
Tasks
- Discuss analytics use cases and potential problems
- List possible approaches
- ⌛️ Try out 2-3 promising approaches
- Set up the most relevant options on the site
- Access for website WG, fundraising WG, Board, and any other relevant stakeholders
Possible approaches
We want something that does not require cookies, is compatible with privacy laws (GDPR, CCPA), ideally has minimal to no impact on performance for site users, and ideally gives us good reports with little effort. I think three approaches are viable:
- Use the existing Django server logs (behind Fastly caching) as an inaccurate but likely interesting data source. They are inaccurate because requests served from Fastly’s cache never reach the Django servers. We were hoping to do this for “Docs search: tweak results ranking so release notes have lower priority” (#1628), but this never happened.
- Reach out to Fastly to see if we could get access to their analytics product, and use that.
- Set up a dedicated JS-based analytics beacon
I have no experience with Fastly analytics / logs tools, but from my experience with Cloudflare’s equivalent, this kind of tooling is good enough to understand popular pages and high-level geographic audience details, and that’s about it. The big benefit is that it doesn’t require any additional, potentially invasive tracking.
For the dedicated analytics / tracking tools, likely options:
- Plausible self-hosted (used by the PSF for their docs)
- Plausible SaaS
- Cabin (has a very generous free tier)
- Google Analytics (industry standard but requires cookies)
Those come with the clear drawback of loading more code for users, and tracking more data than we would often want. But I think there are ways to mitigate that - picking the least-intrusive option, or only having tracking turned on occasionally.
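To make the "only having tracking turned on occasionally" idea concrete, here is a minimal sketch of a date-window check. The name `tracking_enabled` and the `TRACKING_WINDOWS` values are hypothetical and not part of the current codebase. The idea is that a template would only include the analytics script when the check passes.

```python
from datetime import date

# Hypothetical measurement windows (inclusive); the dates are placeholders.
TRACKING_WINDOWS = [
    (date(2025, 3, 1), date(2025, 3, 14)),
    (date(2025, 9, 1), date(2025, 9, 14)),
]


def tracking_enabled(today, windows=TRACKING_WINDOWS):
    """Return True if `today` falls inside any (start, end) window."""
    return any(start <= today <= end for start, end in windows)
```

Outside of those windows, no analytics code would be loaded at all, so site users only pay the performance and privacy cost during short, announced measurement periods.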
Use cases
Just reiterating the use cases we have - here is what we want to know:
- Rough geographic distribution of our audience across countries
- Page views / Session counts site-wide
- Top landing pages
- Hits per search query
- Bounce rate of specific key pages (fundraising, docs, etc)
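As a rough illustration of how far the server logs approach could get us for page views and search queries, here is a minimal sketch that counts both from access log lines. It assumes the logs follow the Common Log Format, that search pages have a path ending in `/search/`, and that the query is passed as `q`; all three are assumptions to check against the real setup. The function name `summarize` is hypothetical.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Assumption: each line contains a request like "GET /path?q=x HTTP/1.1" 200
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def summarize(lines, search_path_suffix="/search/", query_param="q"):
    """Count successful GET page views per path and hits per search term."""
    page_views = Counter()
    search_hits = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        # Skip unparseable lines, non-GET requests, and non-200 responses.
        if not match or match["method"] != "GET" or match["status"] != "200":
            continue
        url = urlsplit(match["target"])
        page_views[url.path] += 1
        if url.path.endswith(search_path_suffix):
            for term in parse_qs(url.query).get(query_param, []):
                # Normalize case and whitespace so "Forms" and "forms" merge.
                search_hits[term.strip().lower()] += 1
    return page_views, search_hits
```

Calling `page_views.most_common(20)` and `search_hits.most_common(20)` on the result would give a first rough ranking. Because of Fastly caching, the page view counts would undercount popular cached pages, so this is more useful for relative trends and search terms than for absolute numbers.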
To help illustrate the benefits of this kind of data, I wrote a blog post about the Python docs analytics: What we can learn from Python docs analytics. For Django, we would use this data to:
- Know what information is so popular it might warrant more attention / restructuring
- Decide which translation efforts to encourage
- Revamp the docs based on common searches
- Understand where there is friction in our donation flow
Relevant existing issues that this data would support:
- Docs search: tweak results ranking so release notes have lower priority #1628
- Website redesign #2287
- Delete Committees page #2257
- Non-docs content enhancements from user research #1505
- Documentation search results relevance improvements #1097
- Improving the organization of information #1495
- Optimize the donations page #1500