Site usage analytics #2288

@thibaudcolas

We have been discussing setting up analytics on the website for a while (forum: Add Plausible Tracking to DjangoProject.com?). At this point I think there is a pretty clear case for it and it’s more of a matter of deciding what tool(s) we could use and setting them up.

So I think opening this as a GitHub issue will help make this more of a task that we want to happen, not just an idea.

Tasks

  • Discuss analytics use cases and potential problems
  • List possible approaches
  • ⌛️ Try out 2-3 promising approaches
  • Set up the most relevant options on the site
  • Access for website WG, fundraising WG, Board, and any other relevant stakeholders

Possible approaches

We want something that does not require cookies, is compatible with privacy laws (GDPR, CCPA), ideally has minimal to no impact on performance for site users, and ideally lets us get good reports with little effort. I think three approaches are viable:

  1. Use the existing Django server logs (behind Fastly caching) as an inaccurate but likely interesting data source; requests served from the Fastly cache never reach Django, so the counts will be incomplete. We were hoping to do this for Docs search: tweak results ranking so release notes have lower priority #1628, but this never happened.
  2. Reach out to Fastly to see if we could get access to their analytics product, and use that.
  3. Set up a dedicated JS-based analytics beacon.

I have no experience with Fastly analytics / logs tools, but from my experience with Cloudflare’s equivalent this is a pretty viable way to understand popular pages and high-level geographic audience details, and that’s about it. The big benefit is that it doesn’t require any additional, potentially invasive tracking.
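To give a feel for what the log-based approach (option 1) could look like, here is a minimal sketch that counts successful page views per path. It assumes the logs are in the common "combined" access log format, which I have not verified for our servers; `LOG_LINE`, `top_pages` and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: "combined"-style access log lines. The real log format
# on the servers may differ, in which case this pattern needs adjusting.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<target>\S+) [^"]*" (?P<status>\d{3}) '
)


def top_pages(lines, limit=10):
    """Count successful GET requests per path, ignoring query strings."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not look like access log entries
        if match["method"] != "GET" or not match["status"].startswith("2"):
            continue  # only count 2xx page loads
        path = match["target"].split("?", 1)[0]
        counts[path] += 1
    return counts.most_common(limit)


# Hypothetical sample data, not real traffic.
sample = [
    '203.0.113.5 - - [01/Jan/2025:00:00:01 +0000] "GET /download/ HTTP/1.1" 200 512 "-" "curl/8.0"',
    '203.0.113.6 - - [01/Jan/2025:00:00:02 +0000] "GET /download/?ref=home HTTP/1.1" 200 512 "-" "curl/8.0"',
    '203.0.113.7 - - [01/Jan/2025:00:00:03 +0000] "GET /start/ HTTP/1.1" 200 256 "-" "curl/8.0"',
    '203.0.113.8 - - [01/Jan/2025:00:00:04 +0000] "GET /missing/ HTTP/1.1" 404 128 "-" "curl/8.0"',
    '203.0.113.9 - - [01/Jan/2025:00:00:05 +0000] "POST /fundraising/ HTTP/1.1" 200 64 "-" "curl/8.0"',
]

print(top_pages(sample))
```

Because cached responses never reach Django, numbers produced this way would undercount popular pages, so they are best read as relative rankings rather than absolute traffic.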

For the dedicated analytics / tracking tools, likely options:

Those come with the clear drawback of loading more code for users, and tracking more data than we would often want. But I think there are ways to mitigate that - picking the least-intrusive option, or only having tracking turned on occasionally.

Use cases

Just reiterating the use cases we have - here is what we want to know:

  • Rough geographic distribution of our audience across countries
  • Page views / Session counts site-wide
    • Top landing pages
  • Hits per search query
  • Bounce rate of specific key pages (fundraising, docs, etc)
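As an illustration of the "hits per search query" use case, here is a rough sketch that tallies search terms from request targets (for example, extracted from server logs). It assumes search pages have a path ending in `/search/` and carry the term in a `q` parameter; both are assumptions to check against the actual URLs, and `search_query_counts` is a hypothetical helper.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def search_query_counts(targets, search_suffix="/search/", param="q"):
    """Tally normalized search terms found in request targets.

    Assumption: search pages end with `search_suffix` and pass the
    term in the `param` query-string parameter.
    """
    counts = Counter()
    for target in targets:
        parts = urlsplit(target)
        if not parts.path.endswith(search_suffix):
            continue  # not a search page
        for value in parse_qs(parts.query).get(param, []):
            # Lowercase and collapse whitespace so variants group together.
            query = " ".join(value.lower().split())
            if query:
                counts[query] += 1
    return counts


# Hypothetical request targets, not real traffic.
targets = [
    "/en/stable/search/?q=QuerySet",
    "/en/stable/search/?q=queryset+",
    "/en/stable/search/?q=forms",
    "/en/stable/topics/forms/?q=ignored",
]

print(search_query_counts(targets).most_common())
```

Search terms can contain personal data that users typed by accident, so any report built this way would probably need a minimum-count threshold before being shared.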

To help illustrate the benefits of this kind of data, I wrote a blog post about the Python docs analytics: What we can learn from Python docs analytics. For Django, we would use this data to:

  • Know what information is so popular it might warrant more attention / restructuring
  • Decide which translation efforts to encourage
  • Revamp the docs based on common searches
  • Understand where there is friction in our donation flow

Relevant existing issues that this data would support:
