Open Source Licenses for Humanitarian AI/ML Models?
Lately, we've been thinking about the work humanitarian organizations and third parties are doing to develop, train and test machine learning models, and one subject that keeps coming up is the licensing of AI/ML models.
For a host of important reasons, humanitarian organizations might prefer to closely manage the use and licensing of their intellectual property, especially while the humanitarian AI field is so young and, in many areas, untested. Alternatively, organizations, third parties and volunteers might like to "open source" their work, giving others opportunities to collaborate on projects and to help broaden, advance and improve applications of artificial intelligence.
To kick off thinking on the subject, we'd like to share a link to an article on a webinar series launched by the Open Source Initiative (OSI) titled "Should we use open source licenses for ML/AI models?" Mary Hardy, corporate counsel at Microsoft, offers her input on open source licensing in a presentation available for viewing. The article and presentation can give humanitarian actors a primer on open source licensing for AI/ML projects and on core ML components.
Currently, one popular license used by researchers for their AI/ML projects is AFL-3.0. In two paragraphs, Perplexity AI summarizes AFL-3.0 and why it is so popular:
"The Academic Free License (AFL) v3.0 is a permissive open source license that allows free use, modification, and distribution of the licensed work, both for commercial and non-commercial purposes. It grants recipients a copyright license and permits patenting of modifications, as long as the original copyright notices and attribution are preserved.
AFL-3.0 is popular because it provides flexibility similar to the BSD or MIT licenses, without requiring reciprocal source code disclosure for derivative works. This makes it suitable for projects that want to enable commercial adoption and integration with proprietary code. Additionally, its clear language written by an open source legal expert and its approval by the Open Source Initiative contribute to its widespread use."
Homework-wise for the humanitarian community, we need to begin thinking about licensing for our own humanitarian AI/ML models and about how to manage intellectual property generated by humanitarian actors and their partners. Open source licenses for machine learning (ML) models developed for the humanitarian community may carry some unique considerations compared to traditional software licenses. According to Perplexity.ai:
Responsible AI Restrictions
Humanitarian ML models often deal with sensitive data like health records, biometric data, or information about vulnerable populations. To mitigate potential misuse, licenses like the Responsible AI License (RAIL) can include use restrictions prohibiting the model from being applied in unethical ways that could cause harm. For example, banning use for surveillance, discrimination, or human rights violations.
Non-Commercial Focus
Many humanitarian projects prioritize non-commercial use to ensure the models benefit society rather than corporate interests. Licenses like the CreativeML Open RAIL-M allow free use for non-commercial, research, and charitable purposes while restricting commercial applications without a separate agreement.
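As a practical illustration, here is a minimal sketch of how a humanitarian team publishing a model on the Hugging Face Hub might declare an OpenRAIL-style license in its model card metadata so the restriction travels with the released model. It assumes the huggingface_hub library's model card utilities; the repository name and tags are hypothetical, not taken from any real project.

```python
# Minimal sketch (assumption: using huggingface_hub's ModelCard utilities)
# of declaring a CreativeML Open RAIL-M license in a model card so that the
# use restrictions are recorded alongside the published model.
from huggingface_hub import ModelCard, ModelCardData

card_data = ModelCardData(
    license="creativeml-openrail-m",               # OpenRAIL-style license identifier
    language="en",
    tags=["humanitarian", "text-classification"],  # illustrative tags only
)

card = ModelCard.from_template(
    card_data,
    model_id="example-org/humanitarian-model",     # hypothetical repository name
)
card.save("README.md")  # the card's YAML header carries the license field
```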
Accessibility Requirements
To promote equitable access, especially in underserved regions, humanitarian ML licenses may require making the model and training data openly available. This aligns with principles of digital public goods and avoids privatization of resources meant for the public good.
Enforceability Mechanisms
Responsible AI licenses aim to be enforceable through mechanisms like publication of model fingerprints, third-party auditing, and revocation of rights in cases of misuse. This enforceability is crucial when dealing with high-stakes humanitarian applications.

In essence, humanitarian ML licenses strive to balance openness with responsible and equitable use aligned with the project's social mission, rather than maximizing commercial interests. Achieving this balance is a key challenge.
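To make the model fingerprints mentioned under Enforceability Mechanisms more concrete, here is a minimal illustrative sketch (our assumption, not a mechanism prescribed by any RAIL license) of computing a SHA-256 digest of a released weights file that a licensor could publish alongside the license; the file name is hypothetical.

```python
# Illustrative sketch: computing a publishable "fingerprint" of a model's
# weights file so downstream copies can be matched against the licensed
# release. The file name and choice of hash are assumptions, not a standard.
import hashlib

def model_fingerprint(weights_path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a serialized model weights file."""
    digest = hashlib.sha256()
    with open(weights_path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

if __name__ == "__main__":
    # e.g. a checkpoint published alongside the license terms
    print(model_fingerprint("model.safetensors"))
```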
This is some good input too!
Baobab Tech | AI solutions for social impact | Founder of WASH AI
This might work in the ML space for smaller ML models, but one key challenge we are going to see in LLM use for humanitarian systems (even with SLMs) is that humanitarian organizations can't afford to train an LLM from scratch that could come near the capabilities of common models (e.g. the popular llama-3-70b likely cost in excess of $700m, the 8b version around $100m in GPU costs alone) - and these models have a fraction of the capabilities of the ChatGPT-type models that everyone is tinkering with. Most humanitarian LLM systems will be using fine-tuned models. We are already seeing a monopoly in the fine-tuning space with 1 or 2 foundation models dominating, with Llama 3 by Meta being omnipresent. It's a great model, so it makes sense, but there have been concerns around the custom licensing they provide for it - many calling for an Apache 2 or MIT license instead. Even Mistral recently shifted. With big tech taking over the AI space (away from researchers), we need a global shift towards investing in open research, with governments directly contributing to research so that larger, fully open source models are available for the humanitarian space. Nathan Lambert has some great content on open versus... open.
Digital Development Leader - Accelerating Engagement and Impact with Communities Worldwide
Daniel Wilson, food for thought.