CORDATA: an open data management web application for selecting corrosion inhibitors
Data-driven technologies and machine learning are among the latest and most promising approaches in corrosion science for guiding the discovery and design of more effective and environmentally friendly corrosion inhibitors and protective coating systems [3,11-25]. However, one of the main challenges in applying machine learning to understand and design protection systems is creating the datasets needed to train the predictive models [26,27]. Collecting experimental data, and then managing and curating it, is among the most time-consuming tasks in the machine learning workflow. A web application such as the one presented here therefore fulfills two main purposes: (1) it can be used by scientists and engineers working in academia and industry to quickly compare the performance of different corrosion inhibitors and select the most appropriate inhibitor for each specific condition and application; and (2) it provides a framework for organizing curated datasets for different substrates, which will stimulate further machine learning and data-driven developments in the design of corrosion inhibitors.
A general view of the CORDATA application is shown in Fig. 1, and the application is freely accessible at the following URL: https://datacor.shinyapps.io/cordata/. The web application was designed to work on personal computers, tablets, and mobile phones, and includes several features (Fig. 2) that allow users to: (1) search by relevant application conditions, such as type of metal or alloy, possible synergistic combinations of inhibitors, minimum efficiency, temperature and pH range, and minimum aggressive salt concentration; (2) quickly verify the structure of an inhibitor and the reference from which its corrosion inhibition efficiency was obtained; (3) search for specific corrosion inhibitors via an internal search engine; (4) select and compare other properties and aspects of the data, such as molecular weight, SMILES notation, measurement time, corrosion inhibitor concentration, synergistic inhibitor concentration, experimental methodology, literature reference, and the name and institution of the contributor who added each data entry; and (5) follow step-by-step instructions in the user interface to submit additional data, request the dataset, or provide feedback. A spreadsheet template file can be downloaded so that users can add their own data, while the updated dataset will be available for contributors to use in their own machine learning and data-driven research.
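The condition-based search in item (1) can be illustrated with a minimal Python sketch. It is not the application's actual code: the `Entry` record, its field names, the `select_inhibitors` function, and the rows in `ROWS` are all hypothetical, and the rows are placeholders rather than measurements from the database. The sketch only shows how filters on metal, minimum efficiency, pH range, temperature range, minimum salt concentration, and synergistic combinations might be combined.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Entry:
    """One hypothetical database row; field names are illustrative only."""
    inhibitor: str
    metal: str
    efficiency_pct: float      # corrosion inhibition efficiency, in percent
    temperature_c: float       # test temperature, in degrees Celsius
    ph: float
    salt_conc_m: float         # aggressive salt concentration, in mol/L
    synergist: Optional[str] = None  # second inhibitor, if a combination was tested


def select_inhibitors(
    entries: List[Entry],
    metal: str,
    min_efficiency: float,
    ph_range: Tuple[float, float],
    temp_range: Tuple[float, float],
    min_salt_conc: float,
    allow_synergy: bool = True,
) -> List[Entry]:
    """Return entries matching all conditions, highest efficiency first."""
    ph_lo, ph_hi = ph_range
    t_lo, t_hi = temp_range
    hits = [
        e for e in entries
        if e.metal == metal
        and e.efficiency_pct >= min_efficiency
        and ph_lo <= e.ph <= ph_hi
        and t_lo <= e.temperature_c <= t_hi
        and e.salt_conc_m >= min_salt_conc
        and (allow_synergy or e.synergist is None)
    ]
    return sorted(hits, key=lambda e: e.efficiency_pct, reverse=True)


# Placeholder rows for illustration only; these are NOT real measurements.
ROWS = [
    Entry("inhibitor_A", "Al", 92.0, 25.0, 7.0, 0.05),
    Entry("inhibitor_B", "Al", 60.0, 25.0, 7.0, 0.05),
    Entry("inhibitor_C", "Cu", 95.0, 25.0, 7.0, 0.05),
    Entry("inhibitor_D", "Al", 97.0, 25.0, 3.0, 0.05),
    Entry("inhibitor_E", "Al", 99.0, 25.0, 7.0, 0.05, synergist="inhibitor_A"),
]

if __name__ == "__main__":
    for e in select_inhibitors(ROWS, "Al", 80.0, (6.0, 8.0), (20.0, 30.0), 0.01):
        print(e.inhibitor, e.efficiency_pct)
```

Sorting by efficiency mirrors the stated goal of quickly comparing inhibitor performance; a real query would likely also expose the remaining columns listed in item (4).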
At the time of this publication, nearly five thousand corrosion inhibition efficiencies and nearly four hundred compounds have already been added to the database. The data comes from more than one hundred and twenty publications, mainly for aluminium, copper, magnesium, iron and their main alloys. More specific information about the data included in the database can be found in Table 1.
The total number of efficiency values and compounds is already sufficient to find effective corrosion inhibitor solutions for a large number of application cases and conditions, so the database should be immediately useful for scientists and corrosion engineers working on the design of more efficient corrosion protection systems. Nevertheless, the data currently included in the app still represents only a small part of the information available in the literature. The amount of data will increase over the years as more entries are added by the authors and by other research groups who see added value in the database, and as the web application gains traction within the corrosion science community.