Apache Incubation Proposal
This is an in-progress, potential proposal to the Apache Software Foundation Incubator which would start the process of Annotator becoming (after the incubation process is complete) a top-level Apache project.
We are currently discussing the if, when, and how of this proposal on the mailing list.
Based on the Apache Incubator Proposal Template. Block quotes are descriptions of the section pulled from the proposal template.
A short descriptive summary of the project. A short paragraph, ideally one sentence in length.
Annotation enabling code for browsers, servers, and humans.
A lengthier description of the proposal.
The Annotator community seeks to build a foundational set of libraries under a liberal license providing the pieces necessary for developers to add annotation to their projects.
Provides context for those unfamiliar with the problem space and history.
Annotator.js was originally created by Open Knowledge (formerly The Open Knowledge Foundation) to provide annotation over works by Shakespeare. Since that time, Annotator has found its way into a wide range of browser-based annotation systems such as Hypothes.is, LacunaStories.com, and various academic, publishing, and scientific research projects.
Sadly, this increased usage has primarily happened in forks of the main code or through copy-left licensed plugins that prevent their use by many community members.
However, the community remains interested in collaborating and in a foundational future for annotation--both in browsers as well as servers and desktop/mobile applications.
Explains why this project needs to exist and why should it be adopted by Apache.
Annotation is often implemented in ad hoc ways, with developers re-solving problems already well known to the Annotator community. The Annotator community works to provide knowledge and code to help developers more quickly implement or improve annotation within their projects.
We believe bringing the Annotator community into the Apache Software Foundation will allow for wider recognition of the annotation problem space, help more developers find their way to solving this shared problem, provide increased cohesion for our own somewhat fractured community, and increase the use of commonly shared code within a wide range of projects.
- create a collaborative space for the existing Annotator contributors and community
- further ignite interest and activity around annotation
- build foundational libraries for annotation
- implement code to support the Web Annotation Data Model, Protocol, and other annotation related specifications
- potentially re-license Annotator under the Apache License 2.0
- Annotator is currently licensed under a combination of the MIT & GPL
- consolidate (where possible) community activity around building add-ons, annotation storage providers, and use-case specific feature sets
- grow interest and activity in annotation
Apache is a meritocracy.
The project is in transition from a primarily BDFL-based model to one with a more diverse set of committers. There are 36 total known committers to Annotator. Three committers have done the bulk of the coding and decision making, and two of those act as project leadership.
However, the community is much larger and more diverse when the various forks and plugin authors are considered.
We intend to invite and include participants from a wide array of annotation problem spaces to collaborate in this new shared space.
Apache is interested only in communities.
Community calls have been held every 3-6 months, with reports of each call's outcome posted to the mailing list and the annotatorjs.org website.
Most activity within the project happens on the mailing list. There is also a relatively inactive #annotator channel on irc.freenode.net. The website is primarily for promotion and includes promotion of community plugins and showcases projects using Annotator. Documentation is published on readthedocs.org and linked to from the website.
There are many Annotator and W3C Annotation Data Model related projects found on GitHub. Our objective would be to invite these communities to join this collaborative community with the hope of greater stability and community longevity.
Apache is composed of individuals.
The 3 primary committers to the project are Nick Stenning of The Hypothesis Project, Randall Leeds of Medal, and Aron Carroll of Dropbox, Inc. Nick Stenning is the original creator of Annotator. Randall Leeds is an Apache CouchDB committer. Aron Carroll has been a frequent contributor. All three have been members of The Hypothes.is Project in past years.
Other currently active community members include:
- Andrew Magliozzi of FinalsClub.org
- Andrew drives the scheduling of community calls, is active on the mailing list, and encourages progress within the project and community
- Benjamin Young of Wiley (also formerly of The Hypothes.is Project)
- an Apache CouchDB committer
- co-editor of the Web Annotation Data Model
- Oliver Sauter of WordBrain
- active advocate for Annotator and the growth of the annotation community
Other committers have contributed significant amounts of code, content, or issues and discussions, but are currently (in the last 3-6 months) less active on the project. However, at recent annotation-related conferences the scale of the plugin, fork, and ancillary project activity was shown to be much higher than what was apparent from activity on the main Annotator mailing list--in part due to community fracturing, something we hope to fix by joining the ASF.
A full list of Annotator contributors can be seen here: https://github.com/openannotation/annotator/graphs/contributors
Describe why Apache is a good match for the proposal.
The Annotator community believes that the Apache Software Foundation promotes and enforces the sort of community that will best serve the future of the project. It is also believed that Annotator can serve the ASF by providing its tools to bring annotation into various Apache projects and eventually to the apache.org site, project documentation, and other tools within the ASF.
The priority is on increasing community involvement, defining--via the Apache Way--how we will code and collaborate going forward, and upon creating the best possible annotation solution born out of that collaboration.
An exercise in self-knowledge. Risks don't mean that a project is unacceptable. If they are recognized and noted then they can be addressed during incubation.
A public commitment to future development.
The majority of the core committers are from The Hypothesis Project, which uses an earlier version of Annotator within its annotation web service and BSD-licensed h annotation software. Hypothesis is a leader in the web annotation service space with activities in the W3C Annotation Working Group and various partners exploring implementing annotation within their projects.
However, the reliance of the community on a single, primary contributing organization is both a concern of the community and of The Hypothesis Project--whose goal is to bring annotation back to the Web, not merely to increase use of its service or software.
As such, the Annotator project has begun the process of becoming an Apache project to establish a development and community process that encourages diversity and cross-organization collaboration.
Annotator was established as an Open Source project in 2011, with its first release, v0.0.1, being made on January 1st of that year: https://github.com/openannotation/annotator/releases/tag/v0.0.1
The project has continued since that time as an open source project developed on GitHub. The community has grown in diversity since that time and was moved into a separate "openannotation" GitHub organization (from the original "okfn" GitHub organization) in 2014 in an effort to increase community involvement and diversity.
Each of the core committers has worked on and created open source software for themselves or various organizations for more than 5 years.
Healthy projects need a mix of developers. Open development requires a commitment to encouraging a diverse mixture. This includes the art of working as part of a geographically scattered group in a distributed environment.
Currently, the active core contributors are all from a single organization, The Hypothesis Project--a non-profit whose charter is to build open source annotation software.
Active community members as well as plugin and compatible annotation storage system builders are from a much more diverse range of organizations and individuals.
The Annotator community is seeking to diversify the core group of committers to more accurately reflect the diversity of the community and those using Annotator within their projects.
Geographically, the Annotator community is widely distributed across Germany, Hungary, the East and West coasts of the US, and Australia.
A project dominated by salaried developers who are interested in the code only whilst they are employed to do so risks its long term health.
The contributors to the Annotator project from The Hypothesis Project do include Annotator contributions as part of their work for Hypothesis. However, their contributions are not solely directed by the needs of The Hypothesis Project, and often include code and community work done on the developers' personal time.
However, this is still considered a risk and the community is actively pursuing increased participation from non-Hypothesis Project employees.
Apache projects should be open to collaboration with other open source projects both within Apache and without. Candidates should be willing to reach outside their own little bubbles.
The Annotator community also provides an annotation storage system ("annotator-store") built upon ElasticSearch. There are compatible implementations of that API built on various storage systems (including Apache CouchDB), and the community would encourage the creation of other compatible storage systems built upon other Apache storage projects.
Additionally, Annotator is a JavaScript library which could serve any of the various CMS projects within Apache.
The roadmap for Annotator also includes compatibility with the Web Annotation Data Model--which is an RDF-based conceptual model for annotation with a JSON-LD-based serialization. The growing number of RDF-focused Apache projects could take advantage of and contribute to the creation of these features.
Lastly, Apache UIMA can currently generate Open Annotation Data Model annotations as an output of its Natural Language Processing system. These annotations could be provided to Annotator (via plugins or a storage provider) and displayed via the Annotator UI--which could further leverage user interaction with those NLP-based annotations (such as confirmation, rejection, or modification of the annotations made by Apache UIMA's NLP process).
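For readers unfamiliar with the Web Annotation Data Model mentioned above, the following is a minimal sketch of what its JSON-LD serialization might look like, built here as a plain Python dictionary. The property names (`body`, `target`, `TextualBody`, `TextQuoteSelector`) and the context URL follow the data model as published by the W3C; the helper function `make_annotation`, the example URL, and the sample text are hypothetical and are not part of Annotator or annotator-store.

```python
import json


def make_annotation(target_url, quoted_text, comment):
    """Build a minimal Web Annotation as a plain dict (illustrative helper only)."""
    return {
        # JSON-LD context published by the W3C for Web Annotations
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        # The body carries the annotator's comment
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        # The target identifies the document and the quoted passage within it
        "target": {
            "source": target_url,
            "selector": {
                "type": "TextQuoteSelector",
                "exact": quoted_text,
            },
        },
    }


# Hypothetical example data
annotation = make_annotation(
    "http://example.org/hamlet.html",
    "To be, or not to be",
    "Opening line of the soliloquy.",
)
print(json.dumps(annotation, indent=2))
```

Because the result is ordinary JSON, a storage provider or plugin could in principle pass such a document between systems without either side needing a full RDF toolchain.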
Concerns have been raised in the past that some projects appear to have been proposed just to generate positive publicity for the proposers. This is the right place to convince everyone that is not the case.
The Annotator community acknowledges the value and recognition that the Apache brand would bring to the Annotator project. However, the primary interest is in the community building process that the Apache Software Foundation provides.
We do hope that "Apache Annotator" would increase the recognition of and contribution to the Annotator project. However, use of Annotator has continued to grow over the last several years with increased recognition from various other projects, products, and services.
Integrating those developers into the Annotator community and adding them as contributors is seen as a higher priority than increasing awareness through branding.
References to further reading material.
Website:
Documentation:
Mailing List:
Code:
Plugin index:
Describes the origin of the proposed code base. If the initial code arrives from more than one source, this is the right place to outline the different histories.
The original Annotator code base was created by Nick Stenning while at the Open Knowledge Foundation. The code has been in development since before 2011 with the first public release (v0.0.1) happening on January 1st, 2011 on GitHub.
The example annotation storage system (which works with Annotator's stock Store plugin) had its first release on February 21, 2011 and was originally built for Apache CouchDB. The contributor list of annotator-store is similar, but the license is simply MIT (rather than MIT & GPL). The stated copyright is 2010-2012 Open Knowledge Foundation.
Complex proposals (typically involving multiple code bases) may find it useful to draw up an initial plan for the submission of the code here. Demonstrate that the proposal is practical.
The Annotator community is beginning the process of switching to the Apache License 2.0 from the MIT & GPL dual-license. The result will be a clearer license scenario and a stated copyright owner of "The Annotator Community"--currently, the copyright owner is unstated, so the contributions are believed to be "in kind" and the copyright owned by the various contributors.
The annotator-store project is under a clearer, single BSD license. The copyright holder is stated to be the Open Knowledge Foundation with the years 2010-2012. The intent is to move this to the same Apache License 2.0 with the copyright declaration being "The Annotator Community."
The process of collecting relicensing permission from known contributors is underway via the mailing list and GitHub issues--using a model similar to Twitter's when it relicensed Bootstrap.
The Annotator community is hoping to clarify this further through the Apache incubation process.
Annotator depends on the following JavaScript modules from NPM:
- backbone-extend-standalone - MIT
- browserify-shim - MIT
- clean-css - MIT
- enhance-css - MIT
- es6-promise - MIT
- insert-css - MIT
- jquery - MIT
- through - MIT / Apache License 2.0
- xpath-range - MIT + GPL-3.0+ Dual License
annotator-store depends on the following Python modules:
- elasticsearch - Apache License 2.0
- PyJWT - MIT
- iso8601 - MIT
- six - MIT
Note: the Annotator community currently uses a single list hosted by Open Knowledge at: https://lists.okfn.org/mailman/listinfo/annotator-dev
Note: the Annotator community hosts its code on GitHub as part of the "openannotation" organization:
The Annotator community would prefer to continue using GitHub Issues if that is a possibility.
- static website hosting for annotatorjs.org
- Nick Stenning
- Randall Leeds
- Aron Carroll
- Andrew Magliozzi
- Benjamin Young
Nick Stenning, Randall Leeds, and Benjamin Young are currently employees of The Hypothesis Project.
Aron Carroll is currently an employee of Dropbox, Inc.
Andrew Magliozzi is currently an employee of FinalsClub.org.
TBD
TBD
The Sponsor is the organizational unit within Apache taking responsibility for this proposal. The sponsoring entity can be: the Apache Board, the Incubator, another Apache project
The Incubator