
Laboratory digitization platform

Context

It is common for a laboratory to house a wide variety of instruments covering the complexity of the analyses to be carried out. The big brands (Agilent, Waters, Qiagen…) have developed very expensive programs for processing the data generated by their instruments. In the best case, these programs also allow communication between instruments (as long as they are all of the same brand), so the results obtained on one can be used to fill in some of the parameters another needs to start its analysis.

Generally, this type of software is very closed and its import/export possibilities are quite limited, so transferring data between different applications became a problem that caused more inconvenience than benefit. At best, the brands offered complementary applications that transferred data reliably, but these were only valid with their own products and were quite expensive.

Goals

The project aimed to achieve the following objectives:

Requirements

Aside from meeting the goals, the solution had to do so appropriately:

Implementation

Finally, we chose to implement CDF (Cloudera DataFlow), which is based on the following main components:

On the ERP side, it was decided to receive the information through a web service requiring the minimum possible data, since it would be used to ingest data from different applications, belonging to different companies in the group, with varying mechanisms and degrees of integration with the ERP:
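To illustrate the idea of a minimal ingestion interface, the sketch below builds the kind of small payload such a web service could accept. The field names and structure are hypothetical; the source only states that the real interface required as little data as possible so that heterogeneous systems could all use it.

```python
import json


def build_erp_request(sample_id: str, test_code: str, value: str, unit: str) -> str:
    """Build a minimal JSON payload for a results web service.

    Field names are hypothetical, not the actual ERP interface; the
    point is that every source system only needs to supply these few
    values, regardless of its own internal data model.
    """
    return json.dumps(
        {
            "sample": sample_id,  # identifies where the result belongs
            "test": test_code,    # which analysis produced the value
            "result": value,
            "unit": unit,
        }
    )
```

Keeping the contract this small is what lets software with very different export capabilities feed the same endpoint.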

Most of the laboratory software was configured to export a very simple report (a text file) with the most standard structure possible. The platform monitored specific folders on the server where these reports were dropped, looking for new files every few minutes. Depending on the location of a file, the platform expected a specific structure; if the file matched it, the platform extracted the data and generated a request to the ERP web service.
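The polling-and-parsing step above can be sketched as follows. The semicolon-separated report layout and field names are assumptions for illustration; the actual structure depended on which folder the file arrived in.

```python
from pathlib import Path

# Hypothetical report layout: one result per line, fields separated by ";"
# e.g. "SAMPLE-001;pH;7.2;2023-05-04"
EXPECTED_FIELDS = ["sample_id", "test", "result", "date"]


def parse_report_line(line: str) -> dict:
    """Split a report line and map it onto the expected field names."""
    values = line.strip().split(";")
    if len(values) != len(EXPECTED_FIELDS):
        raise ValueError(f"expected {len(EXPECTED_FIELDS)} fields, got {len(values)}")
    return dict(zip(EXPECTED_FIELDS, values))


def poll_folder(folder: Path, seen: set) -> list:
    """Return parsed records from report files not processed yet.

    `seen` tracks already-processed file names so each polling pass
    only picks up new reports, mimicking the periodic folder scan.
    """
    records = []
    for report in sorted(folder.glob("*.txt")):
        if report.name in seen:
            continue
        seen.add(report.name)
        for line in report.read_text().splitlines():
            if line.strip():
                records.append(parse_report_line(line))
    return records
```

In the real platform this scan was performed by the flow engine itself rather than custom code, but the logic is the same: watch a folder, expect a structure, extract the fields.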

To ensure traceability, each line in the file was treated separately, so that if one line/result failed, the rest of the file would continue to be processed. For each line, the request made to SAP and the response obtained were saved on the distributed file system. Likewise, in case of error, an e-mail was sent to different distribution lists depending on whether it was an "operational" error (badly formatted file, missing data…) or a technical error (dropped services, communication errors, unexpected errors…). For the latter, a flow was configured to retry automatically for a certain time. At the machine level, the entire platform was monitored with Nagios.
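The per-line processing and two-way error classification described above can be sketched like this. The field count check, error types, and notification hook are illustrative assumptions, not the actual implementation.

```python
from enum import Enum


class ErrorKind(Enum):
    OPERATIONAL = "operational"  # badly formatted file, missing data, ...
    TECHNICAL = "technical"      # dropped service, communication error, ...


class TechnicalError(Exception):
    """Raised when the ERP web service cannot be reached (hypothetical)."""


def process_file(lines, send_to_erp, notify):
    """Process each line independently so one failure does not stop
    the rest of the file; keep one audit entry per line (request
    outcome plus status), mirroring the traceability requirement.
    """
    audit = []
    for line in lines:
        entry = {"line": line}
        try:
            fields = line.strip().split(";")
            if len(fields) != 3:  # assumed report layout for this sketch
                raise ValueError("unexpected field count")
            entry["response"] = send_to_erp(fields)
            entry["status"] = "ok"
        except TechnicalError as exc:
            entry["status"] = "retry"  # technical errors are retried for a while
            notify(ErrorKind.TECHNICAL, str(exc))
        except Exception as exc:
            entry["status"] = "error"  # operational errors go to the lab's list
            notify(ErrorKind.OPERATIONAL, str(exc))
        audit.append(entry)
    return audit
```

The key design choice is that classification drives routing: operational errors alert the people who can fix the source file, while technical errors alert IT and trigger automatic retries.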

My contribution

First, I was in charge of identifying platforms that appeared to meet the established requirements and of preparing a short study of their advantages, disadvantages, and the cases in which each is usually implemented. The recommended solution included a description of the components that could form it and a diagram of how they would interact.

Once the recommended option was validated with the team, I was in charge of providing a list of possible providers along with my opinion of each of them, based on the information available online.

Later I wrote the RFI and the RFP and developed the requirements (functional, architecture, security, training…).

Once the project started, my responsibility consisted of monitoring the implementation and acting as the liaison with the chosen provider.

Finally, as the project was wrapping up, I was in charge of writing the Project Administration SOP, designing the tests that would validate the implementation, and providing the evidence to the QA department.

Conclusions

Despite the difficulties discussed above, the project was ultimately a success. Not only was the transfer of results achieved in compliance with the established requirements, but having the platform deployed and its capabilities and operation validated laid the groundwork for future integrations.

Possible improvements

As mentioned earlier, certain parameters must be entered into the software that manages each instrument so that they later appear in the report the CDF platform reads. These parameters are then sent to the ERP so that it knows where to place the data being provided; the parameters themselves originate in the ERP.

To improve the user experience, a solution could be implemented to enter these parameters into the instrument software automatically. One option would be to print a 2D barcode (such as a QR code) containing these parameters on the report that accompanies the sample to be tested, so that they can be captured with a reader.
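A minimal sketch of what such a barcode could carry: the parameters serialized as a compact, canonically ordered JSON string, which a barcode library would then render and the instrument-side reader would decode back. The parameter names are hypothetical, and the barcode generation itself is left to whatever library the instrument software supports.

```python
import json


def encode_qr_payload(params: dict) -> str:
    """Serialize the ERP parameters into the compact string that would
    be embedded in the 2D barcode printed on the sample report.
    Sorted keys keep the encoding deterministic."""
    return json.dumps(params, separators=(",", ":"), sort_keys=True)


def decode_qr_payload(text: str) -> dict:
    """Reverse step performed after the reader captures the barcode."""
    return json.loads(text)
```

The round trip guarantees the instrument software receives exactly the parameters the ERP issued, with no manual typing in between.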
