There is real friction in much of the conversation around the potential of GenAI and its intersection with open education and open educational resources (OER). This friction lies in part in how the most widely available commercial products were created: they were trained on enormous datasets that include works protected by copyright. Whilst this training is claimed to be legal under fair-use provisions, the scale and volume do not feel fair to many, and in academic contexts the at-scale use of copyrighted texts with no attribution runs counter to accepted good practice among scholars, for whom plagiarism is a deeply serious matter. Creating OER using the same GenAI products that we know are being used to falsify academic work and spread misinformation can feel uncomfortable, and the ability to safely apply an open license to an OER created in part with GenAI is not a settled or simple matter, though good guidance exists. This is a complex and emergent legal space in which education systems have a big stake, as we regularly rely on the same definitions of fair use for our own purposes.
Already we see a number of legal challenges to the claim of fair use emerging from copyright holders who object to the use of their works. Claims range from the use of copyrighted materials without compensation or appropriate attribution (for example, where the original material was made available under Creative Commons licenses), to content having been acquired illegally (from shadow libraries), to AI output that might be reputationally damaging (where corporate logos appear on content the company would not endorse). Whilst some copyright holders are choosing to pursue legal action, a number of licensing deals have also been made with publishers, particularly in news and journalism. Beyond permission to use content in return for prominent placement, a share of advertising revenue, or improved attribution, a number of the press releases also mention collaboration on the development of new AI products, either for internal purposes or as new or enhanced products for customers.
Academic publishers are no exception, and recent news about licensing deals with Wiley and Taylor & Francis has created yet more unease amongst the academic community, where many see commercial GenAI systems as antithetical to their values as educators, even as institutions consider licensing commercial GenAI systems as part of institutional infrastructure. This unease often extends beyond copyright concerns to issues of bias, privacy, environmental harms, continued commercial gain at the expense of a knowledge commons, and the extent to which use of GenAI risks devaluing the role of human thinking. As with other forms of educational technology, questions persist about the extent to which commercial GenAI tools are being built to solve problems within education specifically, or are informed by the perspectives of those within education systems.
Notable within the context of open education is the recent partnership between OpenStax and Google, which makes OpenStax OER available via Google’s Gemini product. Much as a Google image search can be constrained to openly licensed images, Gemini users can now limit requests to just OpenStax resources, with the intention of improving the accuracy and trustworthiness of results. Similarly, the University of Michigan have leveraged their relationship with Microsoft to build their own suite of closed GenAI tools on top of ChatGPT, specifically to ensure that their standards around privacy are maintained and to make GenAI tools accessible and equitably available to the entire institution. They now intend to share their learnings, and possibly even technical expertise, with other institutions.
This kind of experimentation is necessary but likely not sufficient to fully avoid new barriers to equitable access and inclusion in education, or new forms of enclosure of knowledge. The challenge as we build our capacity is to think beyond what we can build with commercial GenAI products or how to further tune them to our purposes, and towards what an education-focussed GenAI commons of openly shared projects, technologies, and systems might look like. We need to think beyond how we can produce a greater variety of OER content artefacts more cheaply and at scale or improve the accuracy of existing commercial tools, and towards open GenAI technologies as digital public goods.
Openly licensing the complex prompts that we build is one place to start. But in line with the aims and objectives of the 2019 UNESCO Recommendation on Open Educational Resources (OER), we ought to be considering where to collaborate at sector level on the development of open GenAI technologies and systems that can support the development of OER (amongst other things) for the benefit of public education systems: technologies that are owned and governed by our community and aligned with our responsibilities and values.
What might open equivalents of interactive textbooks, able to respond to requests to summarise or synthesise content, look like, especially where they are built to work on everyday computing hardware in places where internet connectivity is expensive and unreliable? What could an education-specific aggregation service like Poe do to support innovation and improve the discoverability of open GenAI tools and sources of expertise, advice, and collaboration?
Realising an education-focussed GenAI commons would need significant amounts of money, resources, and expertise, which is no small matter in education systems that are increasingly resource constrained. That is all the more reason to argue that we fundamentally need a mechanism for sharing and working together easily. Creative Commons licenses allow us to share and re-use educational materials, widening access to participation and making best use of public money. Likewise, Open Source Software licenses have enabled a permissive and frictionless ecosystem for technological innovation that has given the OER community important tools like Pressbooks and Manifold.
This is why, as a committed open education advocate, I’ve been involved as a member of the Board at the Open Source Initiative (OSI) in the work we’ve been doing to define “open source AI”. Building on OSI’s expertise as stewards of the Open Source Definition (on which all open source licenses today are based), we have developed the Open Source AI Definition, which explains in detail exactly what is required for an AI system to convey to its users, developers, and deployers the freedom to use, study, modify, and share for any purpose. Version 1.0 of the definition was released in October 2024 and was developed through a global multi-stakeholder co-design process that included technologists, public interest groups, policy makers, legal experts, and academics. This definition, and the legal documents that will in time be based upon it, are key pieces of the sharing infrastructure we need to enable a meaningful open GenAI commons for education. We fully expect that as the social, legal, and technical landscape around AI evolves, so will this definition, and I think it’s imperative that open educators are part of that conversation.
GenAI has already transformed our education systems, not least by forcing us to reflect on our current practices. Ensuring that we move forward in alignment with the values of open education and do not instead step backwards into greater enclosure of knowledge and increased costs for access to and participation in education requires that we don’t simply learn to use GenAI technologies, but that we collectively own and have agency over them, as we do with OER today.