Software to revolutionise data preparation
The Data Preparer system minimises the time spent preparing data for analysis by offering a radical change in the data preparation paradigm: users describe what they need, and the system uses all available evidence to clean and integrate their data. They can then refine the result by providing feedback and by revisiting priorities.
"The emphasis is on refining the description of what is needed, rather than on pinning down how it should be produced", explains The Data Value Factory's co-founder Prof. Norman Paton. "The system will use all available evidence to automatically clean and integrate the data, minimising human intervention and enabling data preparation at scale".
Under the paradigm introduced by Data Preparer, users provide any number of data sources, a target structure, quality priorities, and, optionally, example data. The target structure and quality priorities make the users' requirements explicit. The example data provides evidence that Data Preparer uses to clean and integrate the data. The system then explores how the data sources relate to each other and to the target, repairs and reformats the data where necessary, and populates the target from the sources.
Data Preparer highlights include:
· Populates a target without requiring the user to handcraft data processing pipelines, edit spreadsheets, or write scripts or rules to operate on the data.
· Data from several sources is automatically repaired, transformed, and combined.
· Data Preparer can search thousands of ways of combining sources.
· Configuration of data preparation is independent of the number of sources.
· The provenance of values in the end product is captured automatically, so that users get a full view of the automation.
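The declarative idea described above (state the target and priorities, let the system populate it and record provenance) can be illustrated with a toy sketch. This is not Data Preparer's API or algorithm: `TargetSpec`, `populate`, `weighted_completeness`, the per-attribute weights, and the sample records are all hypothetical, and the matching step is a simple name comparison standing in for the evidence-based matching the product performs.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TargetSpec:
    """Hypothetical description of what the user wants (not Data Preparer's API)."""
    attributes: list[str]
    # Assumed form of quality priorities: a weight per attribute, default 1.0.
    priorities: dict[str, float] = field(default_factory=dict)


def normalise(name: str) -> str:
    """Lower-case a column name and drop non-alphanumerics for loose matching."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def populate(sources: dict[str, list[dict]], target: TargetSpec) -> list[dict]:
    """Fill the target from every source, recording where each value came from."""
    wanted = {normalise(a): a for a in target.attributes}
    rows = []
    for source_name, records in sources.items():
        for record in records:
            row = {a: None for a in target.attributes}
            provenance = {}
            for column, value in record.items():
                attr = wanted.get(normalise(column))
                if attr is not None:
                    row[attr] = value
                    provenance[attr] = f"{source_name}.{column}"
            row["_provenance"] = provenance
            rows.append(row)
    return rows


def weighted_completeness(rows: list[dict], target: TargetSpec) -> float:
    """Share of target cells that are filled, weighted by attribute priority."""
    filled = total = 0.0
    for row in rows:
        for attr in target.attributes:
            weight = target.priorities.get(attr, 1.0)
            total += weight
            if row[attr] is not None:
                filled += weight
    return filled / total if total else 0.0


# Made-up sample data: two sources with differently named columns.
sources = {
    "agency_a": [{"Post Code": "M1 1AA", "Price": 250000}],
    "agency_b": [{"postcode": "M2 2BB", "bedrooms": 3}],
}
target = TargetSpec(
    attributes=["postcode", "price", "bedrooms"],
    priorities={"postcode": 2.0},
)
rows = populate(sources, target)
score = weighted_completeness(rows, target)
```

Adding a third source to `sources` requires no change to `target` or to the functions, which mirrors the claim that configuration is independent of the number of sources. Each row's `_provenance` entry shows which source column supplied each value.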
As a result, any user with basic familiarity with the domain(s) in which data wrangling takes place – be it open government data, health/social care, e-commerce, real estate, finance, etc. – can quickly obtain a self-service data preparation result. Existing data preparation solutions involve, without exception, a significant number of fine-grained decisions, which typically impedes data preparation at scale. The Manchester-based company expects that, in contrast, users of Data Preparer will be able to unlock significantly more data value, since data preparation becomes possible in cases where the number of available data sources is prohibitively large or where changing schema definitions rule out conventional manual data preparation.
A free, one-month trial of Data Preparer is available for download (https://thedatavaluefactory.com/
The Data Value Factory / Nikolaos Konstantinou