The digital lab; trends in ELN, LIMS and other research software

Discussions of scientific research data management, bioinformatics, trends in scientific software, 21CFR11 compliance and the role of Electronic Lab Notebooks (ELN) and Laboratory Information Management Systems (LIMS) in modern research workflows and collaborations. If you are a bioinformatics specialist or an ELN / LIMS vendor, user, integrator or developer, and you want to share your experiences with others, please feel free to contact me via the email link in my profile below.


Tuesday, November 19, 2024

Coming in CERF 6: Improved support for using custom apps to perform mission-critical tasks and analyses on files stored in CERF.

Imagine you are a research organization that works with data files in some specialized format. A genetics lab working with GenBank .GBK or SnapGene .DNA sequence files would be a good example. Now imagine your software engineers have written a custom app designed to perform some calculation or processing task on your data files, with the result or summary output to a new file.

Let's further imagine that, as a data manager, you need a good record of when each analysis was performed, who performed it and precisely where the results are stored. As it happens, this is a workflow that CERF ELN is very well pre-adapted to support.

In most cases, users work with CERF in conjunction with the default, industry-standard applications on their local computer. An MS Word file, for example, may automatically open in MS Word, whilst your .DNA files may open in, say, SnapGene. This workflow illustrates one of the unique advantages of a combined ELN and document management system that uses a desktop application to process your files. CERF carefully logs the interaction between the user and the files stored on the CERF server and displays all activity in the secure audit trail, so that managers are aware of current and past activity and access. In some cases, it may be advantageous to work with highly specialized applications that you've written yourself, designed for performing specific tasks on data that you have stored in CERF. With CERF ELN, users can specify local applications on their computer that they would like to use to check out and edit specific file types. This allows users to optionally check out files from CERF and open them in a non-default local application.

Lab-Ally has been working with bioinformatics students at the University of Maryland to create a toolbox of small accessory applications that can be used for processing various data files stored in CERF. Each academic term, as part of a capstone bioinformatics class, small groups of students (supervised by Lab-Ally) design, build and test an application of their choice. The application is designed to solve some common bioinformatics problem. An example is described below.

One team of four students recently built a GenBank extractor to make parsing genomic data easier. The program produces a simplified output from GenBank files that is readily compatible with CERF and the CERF search feature. The application can be used as a standalone tool or integrated with CERF ELN to allow for superior record keeping, better efficiency and improved organization-wide collaboration. The parser is designed to extract essential information from GenBank files and output a readable .rtf file.

What Does the GenBank Parser Do?

The parser extracts important data from GenBank files, such as:

Accession
Organism (Genus species)
Taxon data
Gene(s)
Genetic Sequence

It then organizes this data into an .rtf file, which is easy to read and compatible with most platforms. Below is an example of what you will find in the output:

RTF file showing various metadata retrieved from within a sequence file
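For readers curious what this kind of extraction involves, here is a minimal Python sketch. It is not the students' code; it is an illustration that assumes a single-record GenBank flat file with the usual ACCESSION, ORGANISM, /gene= and ORIGIN layout. The function names (parse_genbank, to_rtf) and the toy record are invented for this example.

```python
import re

def parse_genbank(text):
    """Pull a few summary fields out of a single GenBank flat-file record."""
    record = {"accession": None, "organism": None, "genes": []}
    taxonomy, sequence = [], []
    in_taxonomy = in_origin = False
    for line in text.splitlines():
        if line.startswith("//"):              # end-of-record marker
            break
        if in_origin:                          # sequence rows: drop numbers and spaces
            sequence.append(re.sub(r"[^A-Za-z]", "", line))
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("ACCESSION"):
            fields = line.split()
            record["accession"] = fields[1] if len(fields) > 1 else None
        elif line.startswith("  ORGANISM"):
            record["organism"] = line[len("  ORGANISM"):].strip()
            in_taxonomy = True                 # lineage follows on indented lines
        elif in_taxonomy and line.startswith(" " * 12):
            taxonomy.append(line.strip())
        else:
            in_taxonomy = False
            match = re.match(r'\s+/gene="([^"]+)"', line)
            if match and match.group(1) not in record["genes"]:
                record["genes"].append(match.group(1))
    record["taxonomy"] = " ".join(taxonomy).rstrip(".")
    record["sequence"] = "".join(sequence)
    return record

def to_rtf(record):
    """Render the summary as a minimal RTF document, one field per paragraph."""
    def esc(value):
        return value.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")
    rows = [
        ("Accession", record["accession"] or ""),
        ("Organism", record["organism"] or ""),
        ("Taxon data", record["taxonomy"]),
        ("Gene(s)", ", ".join(record["genes"])),
        ("Genetic Sequence", record["sequence"]),
    ]
    body = "".join("\\b %s:\\b0  %s\\par\n" % (label, esc(value)) for label, value in rows)
    return "{\\rtf1\\ansi\n" + body + "}"

# Toy record invented for this example (not a real GenBank entry).
SAMPLE = """\
LOCUS       TOY00001                  30 bp    DNA     linear   PLN 01-JAN-2024
DEFINITION  Toy record for illustration only.
ACCESSION   TOY00001
SOURCE      Saccharomyces cerevisiae
  ORGANISM  Saccharomyces cerevisiae
            Eukaryota; Fungi; Ascomycota.
FEATURES             Location/Qualifiers
     gene            1..20
                     /gene="geneA"
     CDS             1..20
                     /gene="geneA"
ORIGIN
        1 gatcctccat atacaacggt
       21 atctccacct
//
"""

record = parse_genbank(SAMPLE)
rtf_text = to_rtf(record)
```

Writing rtf_text to a file with an .rtf extension gives a document that most word processors can open. A production parser would also need to handle multi-record files and wrapped fields, which this sketch ignores.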

How to Use the Parser as a Standalone Application

Install the Application:

Run the installation file on your computer

Launching the Application:

Navigate to the executable file of the program: go to C:\Program Files\GenBankParser and double-click on GenBankParser.exe.
A window should pop up with an "Open File" button.

Troubleshooting Display Issues:

If the window doesn’t show up properly, try resizing the window. Some users have experienced this issue, and resizing the window can often solve it.

Processing a GenBank File:

After clicking the Open File button, choose a .gb or .gbff GenBank file from your system.
The application will process the file and save an .rtf file to your desktop.

How to Use the Parser with CERF

If you’ve installed the parser, the next step is to configure CERF so it can summon files from the CERF server on demand and utilize the parser tool. Without this step, CERF would simply open GenBank files in whatever the default sequence editing application is on the user's local machine.
In CERF, navigate to Tools > Options > Applications.
Add the GenBankParser by pointing to the .exe in C:\Program Files\GenBankParser.
Set the MIME type to chemical/x-genbank. This helps CERF to understand what types of files you would like to open with the specified application.
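To make the idea behind the MIME type setting concrete, here is a small Python sketch of how a file extension can be resolved to a MIME type and then to a helper application. This is only an illustration of the concept, not how CERF works internally; the HELPER_APPS table and the helper_for function are hypothetical names made up for this example.

```python
import mimetypes

# Register the two GenBank extensions under the MIME type used in the post.
# (Python's default tables are not assumed to know these extensions.)
mimetypes.add_type("chemical/x-genbank", ".gb")
mimetypes.add_type("chemical/x-genbank", ".gbff")

# Hypothetical lookup table: MIME type -> path of the application to launch.
HELPER_APPS = {
    "chemical/x-genbank": r"C:\Program Files\GenBankParser\GenBankParser.exe",
}

def helper_for(filename):
    """Return the configured helper application for a file, or None."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return HELPER_APPS.get(mime_type)

genbank_helper = helper_for("plasmid.gb")
word_helper = helper_for("report.docx")   # no helper registered for this type
```

The point of keying on a MIME type rather than a file extension is that several extensions (.gb and .gbff here) can share one application mapping.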

This is how Tools > Options > Applications should look once it's set up:

Viewing GenBank Files with CERF:

Locate any .gb or .gbff file in CERF’s collections.
Right-click the file, select View-in, and choose GenBankParser from the list.
The parser will open, allowing you to process the file.

CERF modified "View In..." right-click options

File Output:

The application will process the file and save an .rtf file containing the results of the parser analysis to a specified local location.

Pasting as a Relation:

The file can then be dragged from the desktop into CERF, and specifically onto the associated file to have it pasted as a relation. This has the advantage that once added to CERF, the .rtf file is immediately indexed for searching so that users with the correct access permissions can search for target text that is located in the .rtf, and once they find THAT file, they can also locate the parent file containing the original raw sequence data.

New in CERF 6, the system will offer the option to automatically associate new files produced by custom applications (containing the results of some analysis) with the parent file containing the raw data. Since CERF offers outstanding version control, it will also be possible to perform these sorts of analyses with different versions of the original data file, associating the results with the correct version of the data in each case, and recording the entire process accurately in the CERF audit trail. We also hope to eventually offer this student-built parser and many other "add-on" tools for use with CERF on our website some time after the release of CERF 6 in 2025.

If you would like to see this tool in action or take a look at the code for the tool that these students built, or if you are a student or developer who would like to work with us to create additional tools like this genetic parser, we would love to hear from you. You can find contact information on the Lab-Ally website.

CERF ELN 5.3 is proving to be a workhorse, and CERF 6 is now well underway.

CERF 5.3 is another big step forward for Lab-Ally and CERF users everywhere. With the challenges scientists and labs everywhere experienced during the pandemic, CERF has become more valuable than ever, since the solution is uniquely pre-adapted for managing work-from-home scientists, technicians and other staff who need to access data and documents securely in ways that allow managers to retain outstanding, top-down awareness of who has been working on what, when they accessed the files, and what they changed.

Some of the new features added since CERF 5.0 include:

file viewer

View files and your folder hierarchy immediately in the center panel or in a new CERF window.

View files from notebooks in separate full-size windows to allow efficient comparison, examination and multitasking.

File viewer supports multiple windows to allow users to quickly compare any number of files.

File viewer is integrated with right-click throughout system to allow easy examination of versions, search results or any resource in CERF.

Better support for files of all types and viewing unsupported files using “official print copy” combined with the file viewer throughout the system.

search

More complete and logical columns to display more information about search results.

More useful search parameters with better organization. New parameters include signature status, activity status and edit status.

Export of results as .csv

More features associated with saved searches to make it easier to use them to generate reports.

Easier to reset searches.

Better options for learning more about characteristics and location of items in search results.

usability

Parity of features inside and outside notebooks

Extended list of template files to allow for creating content more easily in any location, including “standalone” text editor files in file cabinets and more.

Increased flexibility for making notebook entries of various types including plain text and RTF.

Menus, buttons, workflows, speed and stability reworked throughout.

New features to monitor, maintain and fix network connectivity and alert user to network problems.

Ability to instantly export .csv summaries from various locations including file info, audit trail.

Improved performance, especially for mature servers storing many thousands of files, documents and collections.

print to PDF

Numerous improvements to print to PDF process.

compliance and security

More complete information in audit trail with new section for access logs showing failed login attempts.

Hashing of resources in flight to prevent man-in-the-middle interception.

Improvements to controlled documents allowing for automated access to new items based on user acceptance of terms.

Exporter application for intuitive processing of exported xml files.

CERF 5.3 (current official release) can be tested by anyone - visit our website for instructions:

https://cerf-notebook.com/resources/getting-started-with-cerf-free-trial/

ongoing projects / CERF 6

Update to latest Java, Tomcat and MySQL versions.

External helper apps for performing various actions on files stored within CERF.

Customizable user interface “themes” to allow users to control look and feel.

CERF 6 has of course turned into a massive project as we have jumped from Java 1.8 all the way to Java 21. To give our customers maximum flexibility, we plan to support both the server and the client software on Windows, Mac and Ubuntu environments, with the server hosted either on-premise or on a private AWS instance managed by Lab-Ally. We also now offer customers two purchase options: customers can choose our standard perpetual license with an annual support plan or, to reduce up-front costs, opt for an all-inclusive annual subscription model.

Monday, January 23, 2017

CERF ELN Version 5.0 is here!

Well folks, CERF ELN 5.0 is finally ready for release.

Read the press release here.

CERF 5.0 is the result of an enormous amount of hard work by many people and is the culmination of almost 2 years of work. This is the first version of CERF created by its new producer, Lab-Ally.

Job one was locating and moving the source code to an all-new, modern, agile and fully integrated development, build and support platform. This new platform will allow us to move forward more quickly after this initial release (or more accurately "re-boot"). Our next engineering task involved updating the core components that make the whole shebang work: Java, Tomcat, Open Office, MySQL and modern SSL certificates, plus dozens of other components and libraries that all needed to be updated.

Next came the hard part, refactoring all the code to get the updated components and build environment to work together and spit out a functioning product. This part took us months, but when the newest version was finally birthed, we liked what we saw.

Then we went on a graphics spree, refreshing and redesigning almost all of the icons and buttons, and adding support for GUI features (like Mac's full screen mode) that didn't exist when CERF was first created.

As we worked on the product and started using it for data management within our own organization, obvious priorities for new features, improvements and refinements started to emerge, as did the need to "comment out" certain older, buggy or deprecated features that we plan to circle back to later. The rate at which the team started to brainstorm new ideas began to accelerate. New feature requests poured into our request system (JIRA) and before long, an ambitious roadmap that stretches years into the future started to emerge.

CERF 5 focusses on shoring up the product's most powerful features: semantic metadata and semantic search, round trip editing, flexible import and export of data and the use of notes, tags and configurable ontologies to add meaning to your files. Several new search parameters were added and the default search parameter list was re-designed to make it easier to use. A new export tool was created and a new version of the Automaton (formerly the "Automation Client") was built. Lab-Ally has also redoubled efforts to prioritize product quality, speed and stability and is also putting much more emphasis on clear and complete documentation as well as compliance with industry-standard security tools like MS and Apple code signing (which previously had been largely ignored). Looking into the future, CERF will become increasingly focussed on the needs of GLP or "spirit of GLP" labs, with full support for things like ALCOA and related documentation principles.

The last piece in the puzzle was tracking down the code for the iPad App and redesigning it to work with modern iPads and to comply with Apple's more stringent code and security standards. Honestly, the iPad app was never much more than a prototype when it was first released in 2010 or 2011, so it took quite a lot of effort to finally get the new version to the point where we were comfortable releasing it. We called it iCERF and it's available on the iTunes Store now.

We are happy with the results and we think CERF is well positioned to take advantage of a growing demand for a full-featured Electronic Lab Notebook and 21CFR11 compliant document management system that can be installed on-site. The cloud may be popular for many sorts of data storage, but when it comes to mission critical, irreplaceable intellectual property, the smart organizations are getting tired of huge corporations holding their data hostage on the cloud where we all know that the US and Chinese governments will probably rummage through it any time they like.

If you want a free demo of this newest version, please contact Lab-Ally.

Monday, July 11, 2011

It's all just semantics.

The problem with computers is that they are, as we used to say back in the UK, "all face and no trousers". Computers can hold huge amounts of data and can quickly search for target character strings or numeric values, but ultimately they have no idea what any of the data actually mean. This is problematic when dealing with a data management program that centralizes information pouring in from many scientists. If I want to find something that I know I at one time contributed to an ever-increasing mountain of data, I can search for a word or value that I know I included when I created it. However, if I want to find something someone else created, then I have a problem because I don’t necessarily know any of the words or numeric values that they included, and they may no longer be available to ask. Additionally, I won’t necessarily recognize search results as useful based on a file name, image thumbnail or some other preview that is the result of a full text search. When a scientist asks another scientist an ambiguous question, we, as humans, can respond in a uniquely human way by saying something like “what do you mean?” Poor, dumb computers on the other hand can never know “what you mean”, because they don’t understand meaning.

A good ELN system finds ways to associate rich meaning with data to make it easier to find information and build upon past laboratory research. One way to do this is to associate metadata with raw data files, which helps to identify the data’s origin, meaning and relationships. Some types of metadata are common and well known – keywords for example, or the “tags” that so many of us use to identify our friends in Facebook pictures. Savvy organizations understand the importance of metadata and may insist that information such as sample numbers, project IDs and grant numbers be associated with all data to make it easier to gather and find later, but this kind of metadata still assumes that users know and follow established conventions and naming schemes. A good ELN can go one step further. Using technologies such as OWL and RDF, a good ELN can associate semantic metadata with objects using pre-defined ontologies that can anticipate how humans make meaning of data. Think of an ontology as a set of related, hierarchic terms with increasingly specific meanings. For example, the term “Autoimmune Disease” might be subdivided into a full list of 100 different examples (Addison’s disease, Alopecia, Arthritis, Allergies etc. etc.) and some of these second level terms might be subdivided into third level terms (Allergies include Hay Fever, Penicillin Allergy, etc). If you perform an experiment related to Hay Fever, and you associate the semantic metadata label “Hay Fever” with that experiment, then later a colleague searches for “Autoimmune Disease”, a good ELN is smart enough to include the Hay Fever experiment in the search results because it “knows” that Hay Fever is a type of Autoimmune Disease, even though the experiment does not anywhere contain that exact phrase.
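The Hay Fever example can be sketched in a few lines of Python. This is a toy illustration of the idea, using a plain dictionary in place of a real OWL/RDF ontology; it does not show how CERF implements semantic search, and the experiment names and the semantic_search function are invented for the example.

```python
# Toy ontology mirroring the example above: each term points to its
# broader parent term (None marks the top of the hierarchy).
PARENT = {
    "Autoimmune Disease": None,
    "Allergies": "Autoimmune Disease",
    "Arthritis": "Autoimmune Disease",
    "Hay Fever": "Allergies",
    "Penicillin Allergy": "Allergies",
}

def ancestors(term):
    """Yield the term itself, then every broader term above it."""
    while term is not None:
        yield term
        term = PARENT.get(term)

def semantic_search(query, tagged_items):
    """Return items tagged with the query term or with any narrower term."""
    return [name for name, tag in tagged_items if query in ancestors(tag)]

# Invented experiments, each carrying one semantic tag (or none).
EXPERIMENTS = [
    ("Pollen exposure assay", "Hay Fever"),
    ("Joint inflammation study", "Arthritis"),
    ("Buffer stability test", None),
]

broad_hits = semantic_search("Autoimmune Disease", EXPERIMENTS)
narrow_hits = semantic_search("Allergies", EXPERIMENTS)
```

Here broad_hits contains both tagged experiments even though neither one mentions "Autoimmune Disease" anywhere, while narrow_hits contains only the pollen assay. The search walks up the hierarchy from each item's tag, which is the "knows that Hay Fever is a type of..." step described above.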

By using industry standard or carefully constructed custom ontologies that make sense for a particular organization, downstream searching and gathering of knowledge assets can be greatly facilitated because a good ELN understands what kinds of resources you are looking for even if you do not know anything about the specific text or content of those resources. A good ELN can also automate the initial association of metadata with certain objects so that the scientist can spend less time manually categorizing their data, and more time performing research. In a sense, a good ELN can be trained to understand what a particular file “means”, bringing the ELN one step closer to the goal of behaving as though it were a real human lab assistant, albeit one that never requests a vacation or asks for a pay raise. There's really only one ELN on the market that leverages modern semantic technologies, and that's CERF. Learn more at https://cerf-notebook.com