Data provenance can broadly be defined as information used to reason about a digital object, although the object itself, the contents of its provenance, the granularity, and the capture techniques differ by domain. Provenance covers all of an object’s ancestors, the processes and manipulations applied to it, and any other information that affected its state. In the scientific domain, provenance is captured as a dependency graph whose nodes represent individual workflow modules and whose edges show the data flow. File systems capture provenance as relations among files, processes, and read and write calls. Provenance on the Web consists of metadata describing the entities and processes involved in producing, or otherwise affecting, a web resource. The database community captures provenance to reason about where a piece of data came from and the process by which it arrived in the database.
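The dependency-graph view used in scientific workflows can be illustrated with a small sketch. This is only a toy model, not the data structure of any particular workflow system: the class `ProvenanceGraph`, its methods, and the module names (`load_csv`, `clean`, and so on) are all hypothetical. It records data-flow edges between modules and answers the basic provenance question of which upstream modules a result depends on.

```python
class ProvenanceGraph:
    """Toy dependency graph: nodes are workflow modules, edges are data flow."""

    def __init__(self):
        # Maps each node to the set of nodes it directly depends on.
        self._parents = {}

    def add_flow(self, source, target):
        """Record that data flowed from `source` into `target`."""
        self._parents.setdefault(source, set())
        self._parents.setdefault(target, set()).add(source)

    def ancestors(self, node):
        """Return every node that `node` transitively depends on."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for parent in self._parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical four-step workflow.
graph = ProvenanceGraph()
graph.add_flow("load_csv", "clean")
graph.add_flow("clean", "aggregate")
graph.add_flow("load_config", "aggregate")
graph.add_flow("aggregate", "plot")

print(sorted(graph.ancestors("plot")))
# ['aggregate', 'clean', 'load_config', 'load_csv']
```

Storing edges from each node to its parents keeps the ancestor query a simple traversal; a real system would also attach metadata such as timestamps and parameters to nodes and edges.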
In workflow systems, provenance enables validation, reproducibility, and replication, and helps gauge the level of trust in results. It is also used for workflow evolution and optimization, workflow debugging, determining authorship, capturing dependencies that may not have been anticipated, and providing a better understanding of the data and processes involved. In databases, provenance is used to determine the trust and validity of data, to solve attribution problems in curated databases, and to support view updates. Provenance-aware file systems capture causal relationships that can be used for detecting intrusions and system changes, tracking file modifications, and debugging.
Data on the web is not tightly bound to its processes, creators, and providers, hence the need for provenance information to resolve questions of authorship, validity, trust, and quality. This is done by using information known about data creators and services, by computing trust values from user-provided annotations, and by computing information quality scores that assign impact values to properties such as accuracy, completeness, and timeliness. Provenance also has numerous applications in cloud computing, where it is used for detecting abnormal behavior of virtual machines or applications, debugging, detecting intrusions or security violations, access control, and searching. Provenance is not limited to digital data: industries such as food and automotive need provenance information to prove compliance with regulations, to detect faults in services and products, and to improve productivity and quality.
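One simple way to picture a quality score built from impact values is a weighted average. The sketch below is an assumption for illustration, not a formula taken from any specific system: the function `quality_score`, the ratings, and the impact values are all hypothetical, and each property is assumed to be rated on a scale from 0 to 1.

```python
def quality_score(ratings, impact):
    """Weighted average of per-property ratings, weighted by impact values.

    Both arguments are dicts keyed by property name; ratings are assumed
    to lie in [0, 1]. The weighting scheme itself is an illustrative choice.
    """
    total_impact = sum(impact[name] for name in ratings)
    if total_impact == 0:
        raise ValueError("impact values must not all be zero")
    weighted = sum(ratings[name] * impact[name] for name in ratings)
    return weighted / total_impact


# Hypothetical impact values and ratings for one web resource.
impact = {"accuracy": 0.5, "completeness": 0.3, "timeliness": 0.2}
ratings = {"accuracy": 0.9, "completeness": 0.6, "timeliness": 0.8}

score = quality_score(ratings, impact)
print(round(score, 2))
```

Dividing by the total impact keeps the score in the same 0 to 1 range as the ratings even when the impact values do not sum to one.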
The usability of provenance information in any domain depends on it being secure. Relying on provenance that has been tampered with can lead to erroneous judgments with serious consequences. For example, a scientist could forge provenance records and present fabricated results, data, or workflows as genuine. Similarly, a manufacturer could tamper with provenance records to imply compliance with rules and regulations and so avoid legal consequences. In digital forensics, provenance plays a major role in proving the authenticity of digital objects, and an attacker who could tamper with these records may gain an unfair advantage. Because provenance represents an object’s history, it should not be modifiable. Securing a provenance trace is a challenging problem in any provenance-aware system, since it affects both the underlying data and the trace. Unsecured provenance may not only leak confidential information but also be vulnerable to attacks such as forgery. Secured provenance traces are vital to the usability, trust, and accountability of any provenance capture system.
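One common way to make a provenance trace tamper-evident is to chain records together with cryptographic hashes, so that altering any earlier record invalidates every hash after it. The following is a minimal sketch of that idea using Python's standard `hashlib` and `json` modules; the helper names (`append_record`, `verify`), the record fields, and the actor names are hypothetical, and the approach is offered as an illustration rather than as the mechanism of any specific provenance system.

```python
import hashlib
import json

# Placeholder "previous hash" for the first record in the chain.
GENESIS = "0" * 64


def _digest(record, prev_hash):
    """Hash a record together with the hash of the record before it."""
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_record(chain, record):
    """Append a provenance record, linking it to the current chain tail."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "hash": _digest(record, prev_hash)})


def verify(chain):
    """Recompute every hash; return False if any record was altered."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["hash"] != _digest(entry["record"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


chain = []
append_record(chain, {"actor": "alice", "action": "create", "object": "results.csv"})
append_record(chain, {"actor": "bob", "action": "modify", "object": "results.csv"})
print(verify(chain))  # True

# Forging the history of the first record breaks verification.
chain[0]["record"]["actor"] = "mallory"
print(verify(chain))  # False
```

A chain like this only detects modification; it does not prevent it, and it does nothing for confidentiality. An attacker who can rewrite the whole chain could recompute every hash, so in practice the latest hash would need to be signed or stored somewhere the attacker cannot reach.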