Flexible Dataset Integrator (fdi)
FDI, formerly known as SPDC, is a Python package for integrating different types of data. The integrated product takes care of inter-platform compatibility, serialisation, persistence, and the data object referencing that enables lazy loading.
Features
With FDI one can pack data of different formats into modular Data Products, together with annotation (description and units) and metadata (data about data). One can make arrays or tables of Products using basic data structures such as sets, sequences (Python list), mappings (Python dict), or custom-made classes. FDI accommodates nested and highly complex structures.
The APIs for accessing the components of ‘FDIs’ are convenient, which makes scripting and data mining directly ‘on FDIs’ easier.
All levels of FDI Products and their components (datasets or metadata) are portable (serializable) in a human-friendly standard format (JSON is implemented), allowing machine data processors on different platforms to parse them, access their internal components, or reconstruct “an FDI”. Even a human with a web browser can understand the data.
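The JSON round trip described above can be illustrated with the standard library alone. This is a minimal sketch and not FDI's own serialisation code: the class SketchDataset, the "_type" tag, and the helper functions encode and decode are hypothetical names chosen for illustration.

```python
import json


class SketchDataset:
    """Hypothetical dataset: values plus annotation (description and unit)."""

    def __init__(self, data, unit="", description=""):
        self.data = list(data)
        self.unit = unit
        self.description = description

    def __eq__(self, other):
        return isinstance(other, SketchDataset) and vars(self) == vars(other)


def encode(obj):
    """'default' hook for json.dumps: tag the object with its class name."""
    if isinstance(obj, SketchDataset):
        return {"_type": "SketchDataset", **vars(obj)}
    raise TypeError(f"cannot serialise {type(obj).__name__}")


def decode(d):
    """'object_hook' for json.loads: rebuild tagged objects, pass other dicts through."""
    if d.get("_type") == "SketchDataset":
        return SketchDataset(d["data"], d["unit"], d["description"])
    return d


# A nested, product-like structure: metadata plus one annotated dataset.
product = {
    "meta": {"creator": "demo"},
    "flux": SketchDataset([1.0, 2.5], "Jy", "measured flux"),
}

text = json.dumps(product, default=encode, indent=2)  # human-readable text
restored = json.loads(text, object_hook=decode)       # reconstructed objects
```

Because the serialised form is plain JSON with a type tag, a processor on another platform can either parse it as generic data or rebuild the typed object, which is the portability property the paragraph describes.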
The toString() method of major container classes outputs a nicely formatted text representation of complex data, which helps in converting FDI to ASCII.
Most FDI Products and components implement event sender and listener interfaces, allowing scalable data-driven processing pipelines and visualizers of live data to be constructed.
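The event sender and listener idea can be sketched as a plain observer pattern. The class names EventSenderSketch and ObservableDataset, the method names, and the shape of the event dictionary are assumptions made for this example; they are not taken from FDI's listener module.

```python
class EventSenderSketch:
    """Hypothetical event sender: keeps listeners and notifies them in order."""

    def __init__(self):
        self._listeners = []

    def add_listener(self, listener):
        # A listener here is any callable that accepts one event argument.
        self._listeners.append(listener)

    def fire(self, event):
        for listener in self._listeners:
            listener(event)


class ObservableDataset(EventSenderSketch):
    """Hypothetical dataset that announces every change to its listeners."""

    def __init__(self, data):
        super().__init__()
        self._data = list(data)

    def append(self, value):
        self._data.append(value)
        self.fire({"source": self, "change": "append", "value": value})


# A downstream pipeline stage or live visualizer would register itself here;
# this sketch simply collects the events in a list.
received = []
dataset = ObservableDataset([1, 2])
dataset.add_listener(received.append)
dataset.append(3)
```

The data source does not need to know what consumes its events, so stages can be added or removed without changing the producer.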
FDI storage ‘pools’ (file based and memory based) are provided as reference implementations for 1) queryable data storage and 2) referencing all persistent data with URNs (Uniform Resource Names).
FDI provides a Context type of product so that references to other products can become components of a Context, enabling encapsulation of rich, deep, sophisticated, and accessible contextual data while remaining lightweight.
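How pools, URNs, and lightweight references fit together can be sketched as follows. Everything here is illustrative: MemPoolSketch and RefSketch are hypothetical classes, the URN layout "urn:<pool>:<type>:<index>" is an assumption made for the example, and the context is modelled as a plain dict rather than FDI's Context product.

```python
class MemPoolSketch:
    """Hypothetical memory-based pool that hands out a URN for each saved product."""

    def __init__(self, name):
        self.name = name
        self._store = {}
        self.loads = 0  # counts how often a product is actually fetched

    def save(self, product):
        index = len(self._store)
        # Assumed URN layout, for illustration only.
        urn = f"urn:{self.name}:{type(product).__name__}:{index}"
        self._store[urn] = product
        return urn

    def load(self, urn):
        self.loads += 1
        return self._store[urn]


class RefSketch:
    """Hypothetical reference: holds only a URN until the product is requested."""

    def __init__(self, pool, urn):
        self.urn = urn
        self._pool = pool
        self._product = None

    @property
    def product(self):
        # Lazy loading: fetch from the pool on first access, then reuse.
        if self._product is None:
            self._product = self._pool.load(self.urn)
        return self._product


pool = MemPoolSketch("demo")
urn = pool.save({"description": "calibration table"})

# The context stores references, not the products themselves.
context = {"calibration": RefSketch(pool, urn)}
```

Building the context does not touch the stored product; the pool is read only when `context["calibration"].product` is first accessed, which is what keeps a container of references lightweight.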
For data processors, an HTTP server with RESTful APIs (named Processing Node Server, PNS) is implemented to interface with data processing modules. PNS is especially suitable for Docker containers in pipelines that mix legacy software or software from incompatible environments to form an integral data processing pipeline.
This package attempts to meet scientific observation and data processing requirements. It is inspired by the data models of the European Space Agency’s Interactive Analysis package of the Herschel Common Science System (written in Java, with Jython for scripting), and its APIs are designed to be as compatible with that package as possible.
FDI Python packages
API Document
- API Reference
  - fdi.dataset package
    - Submodules
      - fdi.dataset.abstractcomposite module
      - fdi.dataset.annotatable module
      - fdi.dataset.attributable module
      - fdi.dataset.baseproduct module
      - fdi.dataset.classes module
      - fdi.dataset.collectionsMockUp module
      - fdi.dataset.composite module
      - fdi.dataset.copyable module
      - fdi.dataset.dataset module
      - fdi.dataset.datatypes module
      - fdi.dataset.datawrapper module
      - fdi.dataset.deserialize module
      - fdi.dataset.eq module
      - fdi.dataset.finetime module
      - fdi.dataset.listener module
      - fdi.dataset.metadata module
      - fdi.dataset.metadataholder module
      - fdi.dataset.ndprint module
      - fdi.dataset.odict module
      - fdi.dataset.product module
      - fdi.dataset.quantifiable module
      - fdi.dataset.serializable module
      - fdi.dataset.yaml2python module
  - fdi.pal package
    - Subpackages
    - Submodules
      - fdi.pal.common module
      - fdi.pal.comparable module
      - fdi.pal.context module
      - fdi.pal.definable module
      - fdi.pal.httpclientpool module
      - fdi.pal.localpool module
      - fdi.pal.mempool module
      - fdi.pal.pnspoolserver module
      - fdi.pal.poolmanager module
      - fdi.pal.productpool module
      - fdi.pal.productref module
      - fdi.pal.productstorage module
      - fdi.pal.query module
      - fdi.pal.runpnsserver module
      - fdi.pal.taggable module
      - fdi.pal.urn module
      - fdi.pal.versionable module
  - fdi.pns package
  - fdi.utils package
  - Diagrams
  - Indices and tables