Authorea, an organization focused specifically on research document creation and conversion, has experience with every mainstream format for academic writing (LaTeX, Word, Markdown, and rich text) and has built a system supporting many combinations of import-to-export conversion flows. Authorea's core conversion tasks run automatically when a document is imported or submitted. At the same time, the document remains accessible to editors, typesetters, and, most importantly, authors throughout the main writing flow. Edits, comments, and suggestions can be made easily without any knowledge of XML or the underlying document conversion process. We seek to develop and maintain the following open source components, which would allow individuals, publishers, and scholarly organizations to take advantage of and build upon Authorea's in-house technology for writing scholarly research documents that are web-native, machine-readable, discoverable, and data-rich.

An Open API

Authorea seeks funding to develop a RESTful API with OAuth 2 authentication, which will allow any organization or individual to extract content from, and deposit content into, Authorea in a variety of formats. The API will allow individual academics and organizations to use Authorea to convert documents between many input and output formats, optionally styled to thousands of vendor styles or custom form factors. For example, bioRxiv would be able to use an open-source API to run its entire submission and publication process via Authorea, in a manner indistinguishable from that of a leading academic publisher. While the API would be the engine of conversion and publication, bioRxiv could build any peer review and metrics services it wishes on top of it.
Proposed timeline: 4 months development/maintenance
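
As a rough illustration of how a client might call such an API, the Python sketch below builds (but does not send) a conversion request authenticated with an OAuth 2 bearer token. The base URL, the /conversions path, the JSON field names, and the format identifiers are all hypothetical, since the API described above has not yet been designed; only the bearer-token header follows standard OAuth 2 usage.

```python
import json
from typing import Optional
from urllib.request import Request

# Hypothetical base URL: the proposal does not define any endpoints.
API_BASE = "https://api.example.org/v1"

# Illustrative format identifiers, loosely based on the formats named in the proposal.
FORMATS = {"latex", "markdown", "docx", "html", "jats"}


def build_conversion_request(
    access_token: str,
    source_format: str,
    target_format: str,
    content: str,
    style: Optional[str] = None,
) -> Request:
    """Build a POST request asking the (hypothetical) API to convert a document."""
    for fmt in (source_format, target_format):
        if fmt not in FORMATS:
            raise ValueError(f"unsupported format: {fmt}")

    # Assumed JSON payload shape; field names are placeholders.
    payload = {"from": source_format, "to": target_format, "content": content}
    if style is not None:
        payload["style"] = style

    return Request(
        url=f"{API_BASE}/conversions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Standard OAuth 2 bearer-token header.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A client would obtain the access token through an OAuth 2 flow and then pass the returned request to urllib.request.urlopen; that step is omitted here because no real service exists at the placeholder URL.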

A toolkit for converting Authorea documents into JATS-compliant XML

Authorea seeks funding to develop and refine the conversion of its documents into publisher-grade JATS XML. Combined with opening our ingestion system API and offering an open specification of our internal formats, this work will benefit the entire ecosystem of academic document representations. We will enable reliable, near-exhaustive conversion from LaTeX, Markdown, and Word into JATS XML, and additionally offer automated pipelines for interacting with web vendors that interoperate via these formats.
Authorea is currently focused on making it easier for researchers to write and collaborate on documents online within the current publishing paradigm. As such, our end users have no direct need for JATS XML, and we have not devoted resources to enabling this capacity. That said, JATS XML conversion is readily achievable by Authorea, and opening up that capacity will have significant payoff for preprint servers, publishers, and institutional repositories. Funding will be used to develop this process for near-exhaustive format coverage, provide a stress test suite for quality assurance, and bootstrap an API endpoint for the JATS output target.
Proposed timeline: 4 months development/maintenance
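
To give a sense of the output target, the Python sketch below assembles a very small JATS-style skeleton from a title and a list of sections using only the standard library. It is not Authorea's converter: the function name and the input shape are assumptions made for illustration, the result is not validated against the JATS DTD, and a publisher-grade document would carry far more metadata (journal information, contributors, identifiers, references).

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def build_jats_article(title: str, sections: List[Tuple[str, List[str]]]) -> str:
    """Return a minimal JATS-style <article> as an XML string.

    `sections` is a list of (heading, paragraphs) pairs; this input shape is
    an assumption for the sketch, not an Authorea internal format.
    """
    article = ET.Element("article", {"article-type": "research-article"})

    # Front matter: only the article title is filled in here.
    front = ET.SubElement(article, "front")
    meta = ET.SubElement(front, "article-meta")
    title_group = ET.SubElement(meta, "title-group")
    ET.SubElement(title_group, "article-title").text = title

    # Body: one <sec> per section, with a <title> and one <p> per paragraph.
    body = ET.SubElement(article, "body")
    for heading, paragraphs in sections:
        sec = ET.SubElement(body, "sec")
        ET.SubElement(sec, "title").text = heading
        for text in paragraphs:
            ET.SubElement(sec, "p").text = text

    # ElementTree escapes special characters such as "&" and "<" on output.
    return ET.tostring(article, encoding="unicode")
```

A stress test suite of the kind described above could start from checks this simple, for example re-parsing each output and confirming that every source paragraph appears in the body, before moving on to full schema validation.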

An open specification for data-driven scholarly writing

The fundamental enabling technology of Authorea comprises a set of guidelines and conventions for organizing, managing, and representing scholarly works and their editing lifecycle. We seek funding to formalize and open these best practices as a cohesive specification for scholarly writing, enabling all of: collaborative writing; powerful version control and history management; academic import/export capabilities; and continuous integration and synchronization with third-party services. As Authorea already aggregates many best-in-class techniques from the state of the art, such as the Git model of version control, and uses exclusively open formats for its internal representations, we will specify scholarly writing as a logical building block of the open scholarly stack. The specification will cover file system organization, document and auxiliary representations, common workflows, and best practices.
Proposed timeline: 4 months development/maintenance

Expenses for building and maintaining partnerships and marketing

To ensure the successful delivery and implementation of the proposed open source components, Authorea will need to market them and form partnerships with stakeholders across the community. While Authorea is already working towards many of these partnerships as part of normal business practice, additional effort will be needed to support the adoption, maintenance, and enhancement of the open source components.
Proposed timeline: Ongoing
 

Development Team

The lead Authorea developer, who has nearly a decade of experience in representation transitions and mathematical knowledge management, is a co-author of LaTeXML, a semantics-preserving transformation system from TeX/LaTeX to XML (JATS/HTML5/ePub/open-ended). Furthermore, several developers on our team have in-depth knowledge of Pandoc, of which Authorea maintains an open-source fork with extensions related to Word (docx) and toolchain upgrades. Authorea has a history of contributing back to the open-source community when our modifications reach maturity, and we have enjoyed a very productive relationship with the maintainers and developers in the format-conversion FLOSS ecosystem.
We seek to hire three full-time developers to complement our existing in-house development team in building and maintaining the outlined open-source components.