There are two ways of spreading light: to be the candle or the mirror that reflects it.
– Edith Wharton, Vesalius in Zante
On April 14, the US Patent and Trademark Office awarded us a trademark on HMMER. This is a good moment to explain how we plan to deal with intellectual property.
HMMER is scientific software, and its methods are described in journal publications. That means that it must be made available in a form that enables any scientist to understand, reproduce, and extend it, like any other result of a scientific paper. For software, this is essentially the same as what people mean by “open source”. Our intent is to make HMMER widely and freely available to the entire scientific community as open source code. At the same time, we have to recognize that HMMER is a large, growing, and increasingly valuable codebase, not just a one-off result, so we’re taking steps to make sure we can sustain it as a long-term, coherent open source project.
HMMER3 is licensed under the GPLv3 (GNU General Public License, version 3). This means anyone can use it, study it, modify it, and even redistribute it — with the requirement that any modified/redistributed versions must also be licensed under the GPLv3. This explicitly includes both “noncommercial” and “commercial” use (whatever that means, in these days of multibillion dollar research universities and garage biotech startups). People at companies are scientists too, with the same rights and responsibilities regarding results in the scientific literature. The only thing the GPLv3 really blocks is someone forking a derivative copy of HMMER and distributing it under a different license, such as a closed-source proprietary license; to do that, you’d need to negotiate a non-GPL license with us first.
We really don’t expect to negotiate any non-GPL licenses, though. We want to enable many different people to contribute to a single open source HMMER codebase, as a shared codebase for bioinformatics and computational biology. Having a lot of contributors means having a lot of copyrights. Many different copyrights already apply to HMMER, and we plan for even more; negotiating with all those copyright holders to obtain a non-GPL license would be prohibitively difficult (for me, if not for you). It’s relatively easy to get everyone to agree to donate their copyrighted code under license terms compatible with the open source GPLv3, and that’s what we plan to do.
The Howard Hughes Medical Institute (HHMI; my employer) is the main copyright holder. Our terrific Hughes lawyer, Heidi Henning, has negotiated with Washington University (St. Louis USA) and the Medical Research Council Laboratory of Molecular Biology (MRC-LMB; Cambridge UK), my former employers, to transfer their copyrights to HHMI. We also have bits of code from a number of other sources, including Apple, IBM, and some other companies of various sizes, and several individuals in the comp bio community.
Did I mention, we want to enable a single open source HMMER codebase? There are several different “HMMERs” out there, some of which have forked HMMER code, some of which say they are independent implementations, and some of which aren’t very clear about what they are. I don’t think this confusion around the name is useful for the community, and frankly, I find it somewhat annoying that people are forking rather than working together. (I also disbelieve that it is possible to independently implement HMMER, because there’s so much unpublished trickery in the code; so as far as I’m concerned, either a faux non-open HMMER is getting different and probably wrong answers and making me look bad, or it’s infringing my work and my license and making me mad.) Especially now with the advent of HMMER3, these other “HMMERs” are obsolete, imho.
To help drive cohesion of a single codebase, we have trademarked HMMER. I would now ask anyone who is distributing something called “HMMER” that is not HMMER to change their name to something else, in order not to confuse people. We will soon start “enforcing” the HMMER trademark with some friendly letters, if needed — these letters will be requests to work together on a common codebase. Of course you are still free to use the codebase under the terms of the GPLv3 for whatever you want — just please don’t call modified versions “HMMER”. If you make useful modifications, please consider contributing them back to HMMER instead. We think the “brand recognition” of HMMER is going to help motivate people to cooperate rather than fork.
Part of this plan involves us taking on more responsibility — we are making a commitment to spending time and effort on integrating useful modifications into the HMMER codebase. For example, I’m already making plans with Bjarne Knudsen and CLCbio (Copenhagen, Denmark) to work together to make sure that CLC will be able to integrate the open source version of HMMER3, rather than needing their own version.
We have debated defensively patenting the key innovations in HMMER3, but decided against it. HHMI, to its great credit, is perfectly prepared to file patents solely to defend the intellectual turf of an open source software tool; that is, if we were to be challenged by some commercial patent holder on something, we could fight fire with fire. In the end, though, I feel that software patents on published scientific results are controversial enough, and enough in conflict with the openness required of such results, that we decided we didn’t want to go there.
We are prepared to license and incorporate other people’s patented technologies in HMMER3, if necessary. The first example of that is the incorporation of patent-pending technology from Michael Farrar, which I use at the heart of HMMER3’s SIMD vector acceleration code. We licensed that technology nonexclusively from Michael specifically limited to its use in HMMER open source code, and we’ll do that with other future technologies as needed. The “patent clause” of the GPLv3 automatically conveys a nonexclusive license to you, and on through to derivative works. This means that you don’t have to do anything; the GPLv3 is automagically taking care of patent issues, once HHMI and I have done the right licensing up front. This is a big reason why I’m using the GPLv3.
A lot of thought has gone into our positions on HMMER’s intellectual property, thanks especially to discussions with HHMI lawyers and staff (Heidi Henning, Seth Brown, and Joanne Theurich). We think we’ve got this right, for a sustainable long-term plan of open source software development that benefits the whole community. But if you have comments or criticisms, this is a good time to hear them.