Wednesday, April 16, 2025

Open Source Electronic Lab Notebooks (ELN) in Academic Research: Balancing Openness, Sustainability, and Institutional Readiness




The increasing complexity of research environments in higher education has prompted a shift toward digital tools that support data stewardship, reproducibility, and collaborative inquiry. Among these, Electronic Lab Notebooks (ELNs) have become central. Open source ELNs, in particular, align with the academic world's values of code transparency, FAIR data principles, low entry cost, and the possibility of customized connections to other data tools. On the other hand, the use of open source ELNs also raises important questions related to deployment, hosting, integration with infrastructure, sustainability, security, compliance and institutional capacity.

Recent developments in the open source ecosystem include the appearance of enterprise-grade open systems with available paid support packages and managed hosting and deployment options. These systems address earlier limitations of "home brew" open source ELNs, making them more viable options for institutions seeking scalable research data solutions. This article examines the advantages and challenges of open source ELNs within the academic development landscape, with attention to early career researchers, student learning, data integrity, legal compliance, cost, and reproducibility.

Affordability and Strategic Cost Management 

Open source ELNs can be free to download and use, assuming that the user has the technical aptitude to configure a complex client-server system themselves, and that they have access to a suitable environment for hosting the server and all of its data. If that is not the case, the good news is that some modern open source ELN systems offer paid support tiers that include cloud hosting, deployment assistance, integration services, and technical assistance. This hybrid model allows small labs or even entire institutions to enjoy the benefits of a professionally managed system and reduce dependency on high-cost proprietary software while retaining full environment and data control (Kanza et al., 2017). Some examples of systems that use this hybrid model of open source with paid support are RSpace, SciNote, eLabFTW and the chemistry-focused tool Chemotion. The price paid by institutions for managed deployments of these systems varies between 50 and 500 USD per user per year depending on the system, the number of users, and many other factors. This range is reasonable when compared to commercial data management products used by large pharmaceutical companies. It is also reasonable considering that the acquisition, integrity, reproducibility and long-term safety of data is arguably the primary raison d'être of university research labs, and should therefore be a top priority. In principle, the flexibility of hybrid open source systems also enables cost-effective pilot systems and fast deployment of a reliable production system at scale, aligning with institutional priorities for both fiscal responsibility and digital transformation, as well as better institution-wide awareness of research activities and data collection (European Commission, 2016). It also allows smaller or less-resourced institutions to participate in the digital research ecosystem without compromising access or functionality.

Capacity Building for Early Career Researchers 

For early career researchers, open source ELNs foster habits of transparency, reproducibility, and collaboration. These tools promote structured, searchable, and shareable record keeping, supporting the development of data literacy and research integrity (Freedman et al., 2015). When paired with centrally managed support systems, ELNs can be integrated into research training and mentoring programs, encouraging best practices in data stewardship from the outset of researchers’ careers (Lowndes et al., 2017). The use of open source ELNs is especially desirable because of the greater range of cost strategies, deployment options and data mobility, which makes it easier for data or copies of data to travel with scientists as they move between institutions, and as their rank, funding, oversight and data publication needs change over the course of their careers.

Expanding Visibility and Continuity of Student Research 

A growing use case for ELNs lies in undergraduate and postgraduate education. Research projects conducted as part of coursework or dissertations often generate valuable data or insights that are not destined for journal publication. ELNs provide a platform for preserving, showcasing, and potentially publishing this work in a semi-formal context. Making student research visible in this way enhances its discoverability, value, and impact (Borghi et al., 2018). ELNs also offer professional development benefits: students can share their research data with potential employers or collaborators and revisit their work in future research contexts, enhancing research continuity and identity formation as emerging scholars (Brew, 2023). Some systems can even make selected work fully available for indexing by Google and other search engines, so that web searches by researchers working in specialized fields will find relevant studies performed by students, possibly introducing those students to new professional connections and future job opportunities. Again, the flexibility of open source systems can offer an enhanced range of strategies for students at any point in their training and early career paths.

Reproducibility and Research Integrity

Open source ELNs support reproducibility through features such as time-stamped entries, version control, well-defined, distributable protocol templates and integration with computational workflows (Karkkainen et al., 2022). These features help ensure research can be verified, reused, or built upon, aligning with funder mandates and journal expectations around open and transparent research. Universities that adopt ELNs institution-wide can foster consistent standards in research documentation, ensuring robust audit trails and institutional readiness for external evaluation or legal challenges. The lower cost of open source systems helps them compete with other cheap but less than ideal alternatives such as MS SharePoint or Google Docs, which, although attractive and familiar, absolutely do NOT offer the same advantages for research integrity. Only an ELN system that fully complies with 21 CFR Part 11 guarantees the accuracy and integrity of electronically collected digital records.
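
To make the idea of time-stamped, tamper-evident entries concrete, here is a minimal Python sketch of a hash-chained notebook log. It is an illustration only and does not describe how any particular ELN works internally; the function names (append_entry, verify_log) and the record layout are assumptions made for this example.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(record):
    """Hash every field of the record except the stored hash itself."""
    payload = {key: value for key, value in record.items() if key != "hash"}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_entry(log, author, text, timestamp=None):
    """Append a time-stamped entry that is chained to the previous entry."""
    record = {
        "author": author,
        "text": text,
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
        "previous_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    record["hash"] = _digest(record)
    log.append(record)
    return record


def verify_log(log):
    """Return True only if no entry has been altered, removed or reordered."""
    previous_hash = GENESIS_HASH
    for record in log:
        if record["previous_hash"] != previous_hash:
            return False
        if record["hash"] != _digest(record):
            return False
        previous_hash = record["hash"]
    return True
```

Because each entry's hash covers the hash of the entry before it, editing an old entry after the fact breaks verification for the whole chain. General-purpose document tools do not usually provide this kind of property out of the box.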

Legal Compliance and Ethical Considerations 

Historically, open source tools have struggled with full regulatory compliance, in part because full compliance often requires hosting on a robust, validated environment rather than on a graduate student’s home server, and may require integration with university SSO systems to be sure of user identity and affiliation. However, modern hybrid systems like RSpace offer optional support packages, essentially as “add on” services, to help with issues like GDPR compliance, encrypted storage, backup, university-managed access control and formal training, making them fully 21 CFR Part 11 compliant and suitable for GMP research or research involving sensitive or regulated data. Some also offer delegated, hierarchical data oversight and system administration. Ensuring institutional oversight and guidance in these areas is critical. Without proper governance, individual researchers may lack the knowledge required to manage legal and ethical risks, particularly when dealing with human subjects or biomedical data.

Sustainability and Institutional Integration

Concerns around sustainability have long hindered the broader adoption of open source software in academic environments. Community-driven development, while dynamic, can lead to uncertainty around long-term viability. The introduction of support contracts and hosted enterprise deployments mitigates these concerns by ensuring continuity, updates, and compliance, all managed by a professional team with a long-term financial stake in the system’s continued existence (Petrisor et al., 2021). Institutional IT teams can now treat open source ELNs as enterprise-grade services, integrating them with identity management systems and aligning them with cybersecurity policies and Research Data Management (RDM) pipelines. This opens new possibilities for institution-wide adoption, from small labs to large collaborative networks. Best of all, the engineering collaborations between the vendor and user base can greatly enhance each institution's own vertical integration objectives (Plankytė et al., 2025). It's true that this may involve recruiting university software engineers to work on complex problems and spend university time contributing to the system, but this is hardly "lost time", since presumably the only alternative is to create the features the university wants outside of the ELN, and most likely this would take at least as long. By building on top of a managed system, the university gains access to the vendor as a full partner who can assist with testing, quality control, technical advice and distribution of the new features to other universities for the benefit of all. Involvement of university developers also helps alleviate sustainability fears, because developers who already know the product and are familiar with its code would be in a better position to take over development and upkeep if the vendor were to end its formal support for the system.

Conclusion

Open Source, Professionally Supported, and Pedagogically Valuable

Open source Electronic Lab Notebooks now offer a pragmatic, sustainable, and pedagogically rich alternative to proprietary systems. Their alignment with open science values, coupled with new support models, makes them a compelling option for institutions seeking to enhance research integrity, student engagement, and cost-effectiveness. For early career researchers and students alike, ELNs provide not only a tool for rigorous documentation but also a platform for scholarly visibility and future research continuity. Academic developers and institutional leaders who invest in these systems, not only as technologies but as pedagogical and professional development tools, stand to cultivate a more open, inclusive, and future-oriented research culture. Of the three best-known ELN systems of this type (RSpace, SciNote and eLabFTW), SciNote is focussed primarily on commercial customers and eLabFTW is focussed mainly on smaller lab deployments. Only RSpace is designed specifically for large-scale, long-term deployment to entire universities. This is reflected in the specialized SSO, filestore and RDM integration, integration with a plethora of well-known third-party research tools, hierarchical administration and free large-scale hosting services offered as part of the standard RSpace deployment package, as well as exhaustive documentation and a thriving GitHub community.

Join the RSpace subreddit here: https://www.reddit.com/r/RSpaceELN

References 

Borghi, J., Van Gulick, A. E., & Hodges, T. L. (2018). Support for student data management at U.S. academic institutions. Journal of eScience Librarianship, 7(1), e1124. https://doi.org/10.7191/jeslib.2018.1124 

Brew, A. (2023). Researcher development and student identity: Changing understandings of academic practice. International Journal for Academic Development, 28(1), 1–14. https://doi.org/10.1080/1360144X.2022.2132134 

European Commission. (2016). Open Innovation, Open Science, Open to the World: A Vision for Europe. https://op.europa.eu/en/publication-detail/-/publication/3213b335-1cbc-11e6-ba9a-01aa75ed71a1 

Freedman, L. P., Venugopalan, G., & Wisman, R. (2015). Reproducibility2020: Progress and priorities. FASEB Journal, 29(9), 3729–3735. https://doi.org/10.1096/fj.15-100100 

Kanza, S., Sword, S., Gibbins, N., et al. (2017). Electronic Lab Notebooks: Can they replace paper?. Journal of Cheminformatics, 9(1), 31. https://doi.org/10.1186/s13321-017-0221-3 

Karkkainen, T., Aavik, G., Vahdat, M., et al. (2022). Electronic lab notebooks improve transparency and reproducibility in research. Nature Communications, 13, 5175. https://doi.org/10.1038/s41467-022-32885-2 

Lowndes, J. S. S., Best, B. D., Scarborough, C., et al. (2017). Our path to better science in less time using open data science tools. Nature Ecology & Evolution, 1, 160. https://doi.org/10.1038/s41559-017-0160 

Petrisor, A., Gavrilescu, A., & Vlad, C. (2021). Evaluating the sustainability of digital infrastructure for scientific research: The case of open source lab notebooks. Journal of Open Research Software, 9(1), 1–10. https://doi.org/10.5334/jors.315 

Plankytė, V., Edmunds, R., & Macneil, R. (2025). Case study of vertical interoperability between research tools enabling an end-to-end sample workflow from collection, to management, to archiving. EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9046. https://doi.org/10.5194/egusphere-egu25-9046

Tuesday, November 19, 2024

Coming in CERF 6: Improved support for using custom apps to perform mission-critical tasks and analyses on files stored in CERF.

Imagine you are a research organization that works with data files in some specialized format. A genetics lab working with GenBank .GBK or SnapGene .DNA sequence files would be a good example. Now imagine your software engineers have written a custom app designed to perform some calculation or processing task on your data files, with the result or summary output to a new file.

Let's further imagine that, as a data manager, you need to have a good record of when any analyses were performed, who performed them and precisely where the results are stored. As it happens, this is a workflow that CERF ELN is very well pre-adapted to perform.

In most cases, users work with CERF in conjunction with the default, industry-standard applications on their local computer. An MS Word file, for example, may automatically open in MS Word, whilst your .DNA files may open in, say, SnapGene. This workflow illustrates one of the unique advantages of a combined ELN and document management system that uses a desktop application to process your files. CERF carefully logs the interaction between the user and the files stored on the CERF server and displays all activity in the secure audit trail, so that managers are aware of current and past activity and access. In some cases, it may be advantageous to work with highly specialized applications that you've written yourself, designed specifically for performing specialized tasks on data that you have stored in CERF. With CERF ELN, users can specify local applications on their computer that they would like to use to check out and edit specific file types. This allows users to optionally check out files from CERF and open them in a non-default local application.

Lab-Ally has been working with bioinformatics students at the University of Maryland to create a toolbox of small accessory applications that can be used for processing various data files stored in CERF. Each academic term, as part of a capstone bioinformatics class, small groups of students (supervised by Lab-Ally) design, build and test an application of their choice. The application is designed to solve some common bioinformatics problem. An example is described below.

One team of four students recently built a GenBank extractor to make parsing genomic data easier. The program produces a simplified output from GenBank files that is readily compatible with CERF and the CERF search feature. The application can be used as a standalone tool or integrated with CERF ELN to allow for superior record keeping, better efficiency and improved organization-wide collaboration. The parser is designed to extract essential information from GenBank files and output a readable .rtf file.

What Does the GenBank Parser Do?

The parser extracts important data from GenBank files, such as:

  • Accession
  • Organism (Genus species)
  • Taxon data
  • Gene(s)
  • Genetic Sequence

It then organizes this data into an .rtf file, which is easy to read and compatible with most platforms. Below is an example of what you will find in the output:


RTF file showing various metadata retrieved from within a sequence file
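
The extraction step described above can be sketched in a few lines of Python. This is a minimal illustration and not the students' actual code: the function name parse_genbank, the dictionary keys and the invented sample record are assumptions made for this example. It reads only the first record of a GenBank flat file, using the standard ACCESSION, ORGANISM, /gene and ORIGIN fields, and it does not handle organism names that wrap onto a second line.

```python
import re

GENE_QUALIFIER = re.compile(r'/gene="([^"]+)"')

# Invented, heavily abbreviated record used purely for illustration.
SAMPLE = """LOCUS       EXAMPLE1       30 bp    DNA     linear   UNA 01-JAN-2000
ACCESSION   XX000001
SOURCE      Examplea demonstrans
  ORGANISM  Examplea demonstrans
            Eukaryota; Fungi; Dikarya;
            Ascomycota.
FEATURES             Location/Qualifiers
     gene            1..30
                     /gene="demA"
     CDS             1..30
                     /gene="demA"
ORIGIN
        1 gatcctccat atacaacggt
       21 atctccacct
//
"""


def parse_genbank(text):
    """Extract accession, organism, taxonomy, genes and sequence from one record."""
    record = {"accession": None, "organism": None,
              "taxonomy": [], "genes": [], "sequence": ""}
    section = None
    taxonomy_lines, sequence_parts = [], []
    for line in text.splitlines():
        if line.startswith("//"):           # end-of-record marker
            break
        if section == "sequence":           # keep letters, drop numbering and spaces
            sequence_parts.append(re.sub(r"[^A-Za-z]", "", line))
            continue
        gene = GENE_QUALIFIER.search(line)
        if gene and gene.group(1) not in record["genes"]:
            record["genes"].append(gene.group(1))
        if line.startswith("ACCESSION"):
            fields = line.split()
            if len(fields) > 1:
                record["accession"] = fields[1]
            section = None
        elif line.lstrip().startswith("ORGANISM"):
            record["organism"] = line.split("ORGANISM", 1)[1].strip()
            section = "taxonomy"
        elif line.startswith("ORIGIN"):
            section = "sequence"
        elif line[:1].strip():              # any other top-level keyword
            section = None
        elif section == "taxonomy":         # indented continuation lines
            taxonomy_lines.append(line.strip())
    lineage = " ".join(taxonomy_lines).rstrip(".")
    record["taxonomy"] = [name.strip() for name in lineage.split(";") if name.strip()]
    record["sequence"] = "".join(sequence_parts)
    return record


summary = parse_genbank(SAMPLE)
```

A production parser would more likely rely on an established library and would need to cope with multi-record .gbff files, which this sketch deliberately ignores.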



How to Use the Parser as a Standalone Application


Install the Application:

  • Run the installation file on your computer
Launching the Application:
  • Navigate to the executable file of the program: go to C:\Program Files\GenBankParser and double-click on GenBankParser.exe.
  • A window should pop up with an "Open File" button.
Troubleshooting Display Issues:
  • If the window doesn’t show up properly, try resizing the window. Some users have experienced this issue, and resizing the window can often solve it.
Processing a GenBank File:
  • After clicking the Open File button, choose a .gb or .gbff GenBank file from your system.
  • The application will process the file and save an .rtf file to your desktop.
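
For readers curious about the final step, the sketch below shows one way a summary could be written out as a minimal .rtf document in Python. It is an assumption-laden illustration rather than the tool's real output routine: the function names (rtf_escape, build_rtf, write_summary), the font choice and the field labels are all hypothetical.

```python
from pathlib import Path


def rtf_escape(text):
    """Escape RTF control characters and encode non-ASCII text as \\uN? units."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            units = ch.encode("utf-16-le")
            for i in range(0, len(units), 2):
                unit = int.from_bytes(units[i:i + 2], "little")
                if unit > 32767:            # RTF expects a signed 16-bit value
                    unit -= 65536
                out.append("\\u%d?" % unit)
    return "".join(out)


def build_rtf(fields):
    """Return a small RTF document with one bold label and value per line."""
    lines = [r"{\rtf1\ansi\deff0{\fonttbl{\f0 Courier New;}}\f0\fs20"]
    for label, value in fields.items():
        lines.append(r"{\b " + rtf_escape(label) + r":} "
                     + rtf_escape(str(value)) + r"\par")
    lines.append("}")
    return "\n".join(lines)


def write_summary(fields, path):
    """Write the RTF text to the given path (pure ASCII after escaping)."""
    Path(path).write_text(build_rtf(fields), encoding="ascii")
```

Calling write_summary({"Accession": "XX000001"}, "summary.rtf") would produce a file that most word processors can open; the accession value here is invented.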

How to Use the Parser with CERF

  • If you’ve installed the parser, the next step is to configure CERF so it can summon files from the CERF server on demand and utilize the parser tool. Without this step, CERF would simply open GenBank files in whatever the default sequence editing application is on the user's local machine.
  • In CERF, navigate to Tools > Options > Applications.
  • Add the GenBankParser by pointing to the .exe in C:\Program Files\GenBankParser.
  • Set the MIME type to chemical/x-genbank. This helps CERF to understand what types of files you would like to open with the specified application.

This is how Tools > Options > Applications should look once it's set up: 


CERF external application selector
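
Conceptually, the configuration above maps file extensions to a MIME type, and the MIME type to a helper application. The Python sketch below illustrates that lookup using the standard mimetypes module. It does not show CERF's own implementation; the HANDLERS table and the application_for function are hypothetical names, while the MIME type and the .exe location are the ones given in the steps above.

```python
import mimetypes

# A private registry, so the interpreter-wide defaults are left untouched.
registry = mimetypes.MimeTypes()
registry.add_type("chemical/x-genbank", ".gb")
registry.add_type("chemical/x-genbank", ".gbff")

# Hypothetical table: which local application handles which MIME type.
HANDLERS = {
    "chemical/x-genbank": r"C:\Program Files\GenBankParser\GenBankParser.exe",
}


def application_for(filename):
    """Return the configured helper application for a file, or None."""
    mime_type, _encoding = registry.guess_type(filename)
    return HANDLERS.get(mime_type)
```

With this table, application_for("sample.gb") resolves to the parser, while a file of another type falls through to None, which corresponds to opening it in the default local application.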


Viewing GenBank Files with CERF:

  • Locate any .gb or .gbff file in CERF’s collections.
  • Right-click the file, select View-in, and choose GenBankParser from the list.
  • The parser will open, allowing you to process the file.

CERF modified "View In..."  right-click options



File Output:

  • The application will process the file and save an .rtf file containing the results of the parser analysis to a specified local location.




Pasting as a Relation:

  • The file can then be dragged from the desktop into CERF, and specifically onto the associated file to have it pasted as a relation. This has the advantage that once added to CERF, the .rtf file is immediately indexed for searching so that users with the correct access permissions can search for target text that is located in the .rtf, and once they find THAT file, they can also locate the parent file containing the original raw sequence data.


New in CERF 6, the system will offer the option to automatically associate new files produced by custom applications (containing the results of some analysis) with the parent file containing the raw data. Since CERF offers outstanding version control, it will also be possible to perform these sorts of analysis with different versions of the original data file, associating the results with the correct version of the data in each case, and recording the entire process accurately in the CERF audit trail. We also hope to eventually offer this student-built parser and many other "add-on" tools for use with CERF on our website some time after the release of CERF 6 in 2025.
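
The relationship between a result file and a specific version of its parent can be pictured as a simple two-way index. The Python sketch below is only a conceptual model under stated assumptions: the class names (FileVersion, RelationIndex), their methods and the example file names are hypothetical and do not describe CERF's internal data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FileVersion:
    """Identifies one specific version of a stored data file."""
    file_id: str
    version: int


@dataclass
class RelationIndex:
    """Tracks which result files were derived from which parent version."""
    results_by_parent: Dict[FileVersion, List[str]] = field(default_factory=dict)
    parent_by_result: Dict[str, FileVersion] = field(default_factory=dict)

    def relate(self, parent: FileVersion, result_name: str) -> None:
        self.results_by_parent.setdefault(parent, []).append(result_name)
        self.parent_by_result[result_name] = parent

    def results_for(self, parent: FileVersion) -> List[str]:
        return list(self.results_by_parent.get(parent, []))

    def parent_of(self, result_name: str) -> Optional[FileVersion]:
        return self.parent_by_result.get(result_name)


index = RelationIndex()
index.relate(FileVersion("sample.gb", 1), "sample_v1_summary.rtf")
index.relate(FileVersion("sample.gb", 2), "sample_v2_summary.rtf")
```

Keeping the version number in the key is what lets a search hit in a summary file lead back to the exact revision of the raw data that produced it.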

If you would like to see this tool in action or take a look at the code for the tool that these students built, or if you are a student or developer who would like to work with us to create additional tools like this genetic parser, we would love to hear from you. You can find contact information on the Lab-Ally website.



CERF ELN 5.3 is proving to be a workhorse, and CERF 6 is now well underway.


CERF 5.3 is another big step forward for Lab-Ally and CERF users everywhere. With the challenges scientists and labs everywhere experienced during the pandemic, CERF has become more valuable than ever, since the solution is uniquely pre-adapted for managing work-from-home scientists, technicians and other staff who need to access data and documents securely in ways that allow managers to retain outstanding, top-down awareness of who has been working on what, when they accessed the files, and what they changed.

Some of the new features added since CERF 5.0 include:

file viewer

View files and your folder hierarchy immediately in the center panel or in a new CERF window.

View files from notebooks in separate full-size windows to allow efficient comparison, examination and multitasking.

File viewer supports multiple windows to allow users to quickly compare any number of files.

File viewer is integrated with right-click throughout system to allow easy examination of versions, search results or any resource in CERF.

Better support for files of all types and viewing unsupported files using “official print copy” combined with the file viewer throughout the system.


search

More complete and logical columns to display more information about search results.

More useful search parameters with better organization. New parameters include signature status, activity status and edit status.

Export of results as .csv

More features associated with saved searches to make it easier to use them to generate reports.

Easier to reset searches.

Better options for learning more about characteristics and location of items in search results.


usability

Parity of features inside and outside notebooks

Extended list of template files to allow for creating content more easily in any location, including “standalone” text editor files in file cabinets and more.

Increased flexibility for making notebook entries of various types including plain text and RTF.

Menus, buttons, workflows, speed and stability reworked throughout.

New features to monitor, maintain and fix network connectivity and alert user to network problems.

Ability to instantly export .csv summaries from various locations including file info, audit trail.

Improved performance, especially for mature servers storing many thousands of files, documents and collections.


print to PDF 

Numerous improvements to print to PDF process.


compliance and security

More complete information in audit trail with new section for access logs showing failed login attempts.

Hashing of resources in flight to help detect man-in-the-middle tampering.

Improvements to controlled documents allowing for automated access to new items based on user acceptance of terms.

Exporter application for intuitive processing of exported xml files.
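
As a general illustration of how integrity checks on data in flight can work, the Python sketch below uses a keyed hash (HMAC-SHA256) so that the receiver can detect a payload altered in transit. How CERF implements its hashing is not described here; the function names and the shared-key arrangement are assumptions made for this example.

```python
import hashlib
import hmac


def sign_payload(payload: bytes, shared_key: bytes) -> str:
    """Return a hex digest that only holders of the shared key can produce."""
    return hmac.new(shared_key, payload, hashlib.sha256).hexdigest()


def verify_payload(payload: bytes, shared_key: bytes, received_digest: str) -> bool:
    """Recompute the digest and compare it in constant time."""
    expected = sign_payload(payload, shared_key)
    return hmac.compare_digest(expected, received_digest)
```

A plain, unkeyed hash sent alongside the data can be replaced by whoever modifies the data, which is why this sketch uses a key that an interceptor does not hold.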


CERF 5.3 (the current official release) can be tested by anyone. Visit our website for full release notes and instructions:


https://cerf-notebook.com/resources/getting-started-with-cerf-free-trial/


ongoing projects / CERF 6

Update to latest Java, Tomcat and MySQL versions.

External helper apps for performing various actions on files stored within CERF.

Customizable user interface “themes” to allow users to control look and feel.


CERF 6 has of course turned into a massive project, as we have jumped from Java 1.8 all the way to Java 21. To give our customers maximum flexibility, we plan to support both the server and the client software on Windows, Mac and Ubuntu environments, with the server hosted either on-premise or on a private AWS instance managed by Lab-Ally. We also now offer customers TWO purchase options: customers can choose our standard perpetual license with an annual support plan or, to reduce up-front costs, opt for an all-inclusive annual subscription model.



Tuesday, December 19, 2017

Market research reports - worth the money?

I came across this ELN report that lists "major players in the ELN sector". Hmmmm.... I'm the CEO of one of the companies that report mentions, and I have never heard of its publishers. I see a lot of spam email and online posts like this that loudly pronounce they are from the "#1 Sophisticated Market Research Reports Platform", but apparently "sophisticated" does not involve picking up a phone, calling the number on our website and talking to us about our product, much less actually using it, or asking the vendor anything at all about it. I can't imagine what this "report" could possibly have to say about CERF ELN or RSpace ELN, but since they have never asked for product or company information of any kind, I doubt it would be anything useful. Furthermore, I think it is probably safe to assume they didn't speak to most of the other vendors they listed in their report either, so my advice would be: Save your money and don't bother with this "report".

I have seen a proliferation of advertisements like this in our sector, and I am surprised that this could possibly be a viable business model, since most of the time the report actually consists of a list of vendors who PAID to have a short description of their product included, and then the "report" is sold for an outrageous fee to gullible execs who don't want to pick up the phone and talk to the ELN vendor directly, which they could easily do for free.

In my opinion, the only way to really find out if an enterprise software product does what you need for your organization is to actually try it in situ, and see for yourself if it solves your problem. Relying on the poorly informed advice of others seems like a lazy strategy, and relying on the paid advice of anyone who has never even seen the product seems even worse.

Update... I see that another (or, possibly, the same) report is now also being advertised on LinkedIn here:

https://www.linkedin.com/feed/update/urn:li:activity:6897461786472280064/





It's unfortunate that this article uses an uncredited image stolen from our website at:

http://cerf-notebook.com/articles/what-is-an-eln/.

The image clearly shows a mockup of one of our products, CERF ELN on the faux computer screen.  The lazy use of this ripped-off image doesn't exactly reassure me that the content of the report is going to be professional, reliable and original. There is an argument that says any publicity is good publicity for our product, and maybe I should be flattered that they used this image, but they could have at least asked first. I feel like I owe it to my colleagues in science and data management to make sure they receive accurate information in any industry reports that they purchase. In this case, it seems unlikely that this report would contain anything that I couldn't also get for free from the vendor's website. In the words of a certain well known TV game show host turned part-time politician,  this ELN report kinda seems like it might be "fake news".