Data Containers: Product

Product

A product is what ties all fdi components together.

Data and Metadata

../_images/product.png
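A product has
  • zero or more datasets: data entities defined in detail (for example images, tables, spectra, etc.).

  • accompanying metadata, i.e. required information such as

    • the classification of this product,

    • the creator of this product,

    • when the product was created,

    • what the data reflects (its intended scope of use),

    • and so on;

    • possibly additional metadata specific to this particular product type.

  • the history of this product: how the data was created.

  • references to related products that form the context of this product.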
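
The four parts listed above can be pictured with a small plain-Python sketch. This is not the fdi implementation: the class name `SketchProduct` and its field names are hypothetical and only mirror the list (datasets, metadata, history, references).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SketchProduct:
    """Hypothetical stand-in for a product; not the fdi Product class."""
    # Metadata: classification, creator, creation time, intended scope, ...
    meta: Dict[str, Any] = field(default_factory=dict)
    # Zero or more named datasets (images, tables, spectra, ...)
    datasets: Dict[str, Any] = field(default_factory=dict)
    # History: how the data was created
    history: List[str] = field(default_factory=list)
    # References to related products that form the context
    refs: List[str] = field(default_factory=list)


p = SketchProduct(meta={"type": "SketchProduct", "creator": "UNKNOWN"})
p.datasets["RawImage"] = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
p.history.append("created from raw telemetry")
print(len(p.datasets), p.meta["creator"])  # prints: 1 UNKNOWN
```

A product with no datasets at all is still valid in this sketch, matching the "zero or more" wording above.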

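History

History is a lightweight mechanism for recording the origin of a product or the changes made to it. Lightweight means that the product data itself does not record changes; instead, external parties can attach additional information reflecting those changes to the product.

The sole purpose of the product history interface is to allow pipeline tasks (as defined by the pipeline framework) to record what they did to generate and/or modify a product.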
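
As an illustration of the idea, the sketch below keeps a history list next to the data and lets each pipeline task append a record of what it did. It does not use the fdi History class; the helper `record_task` and the entry layout are assumptions made for this example.

```python
from datetime import datetime, timezone


def record_task(history, task_name, **params):
    """Append one entry describing what a pipeline task did (hypothetical helper)."""
    entry = {
        "task": task_name,
        "params": dict(params),
        "time": datetime.now(timezone.utc).isoformat(),
    }
    history.append(entry)
    return entry


data = [1, 2, 3]   # product data; recording history never touches it
history = []       # kept alongside the data, not inside it

record_task(history, "calibrate", gain=1.5)
record_task(history, "flatfield", frame="ff_001")
print([h["task"] for h in history])  # prints: ['calibrate', 'flatfield']
```

Keeping the record outside the data is what makes the mechanism lightweight in this sketch: the data needs no change tracking of its own.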

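Serializable

To transfer data across the network between heterogeneous nodes, the data must be serializable. JSON is used to transport serialized data because it is widely adopted, well supported by tools, and easy to use from Python.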
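
A minimal round trip with Python's standard `json` module shows what "serializable" implies in practice. The class `SketchDataset` and the `_class` tag are hypothetical and do not reflect the wire format fdi actually uses.

```python
import json


class SketchDataset:
    """Hypothetical dataset; not an fdi class."""

    def __init__(self, data, unit=None):
        self.data = data
        self.unit = unit

    def to_dict(self):
        # A class tag lets the receiving node know what to rebuild.
        return {"_class": "SketchDataset", "data": self.data, "unit": self.unit}

    @classmethod
    def from_dict(cls, d):
        return cls(d["data"], d["unit"])


def serialize(obj):
    """Turn an object into a JSON string suitable for network transfer."""
    return json.dumps(obj.to_dict())


def deserialize(text):
    """Rebuild the object from a JSON string, checking the class tag first."""
    d = json.loads(text)
    if d.get("_class") != "SketchDataset":
        raise ValueError("unexpected class tag")
    return SketchDataset.from_dict(d)


original = SketchDataset([[1, 2], [3, 4]], unit="eV")
wire = serialize(original)
restored = deserialize(wire)
print(restored.data, restored.unit)  # prints: [[1, 2], [3, 4]] eV
```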

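Product Definition Methodology

Data products are almost always classified in an inheritance order that reflects the underlying relationships of the data model. Comparing the metadata and datasets of many products reveals inheritance relationships among them. An object-oriented approach is therefore chosen here to analyze and define the structure, functionality, and interfaces of products.

Built-in parameters are first specified in YAML format, which is readable by both humans and machines. A helper utility, yaml2python, then generates test-ready Python code for the product class module containing the built-ins.

The YAML schema allows a child product to inherit metadata definitions from one or more parent products. Overriding is also allowed.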
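
The inheritance and override rule can be sketched with plain dictionaries standing in for parsed YAML documents. This is not how yaml2python works internally; the function `resolve_metadata`, the product name `ImageProduct`, and the `exposure` entry are hypothetical. The sketch assumes that later parents win over earlier ones and that a child's entry replaces the inherited entry as a whole.

```python
def resolve_metadata(name, definitions):
    """Merge the metadata of all parents, then apply the product's own entries."""
    spec = definitions[name]
    merged = {}
    for parent in spec.get("parents") or []:
        if parent is None:  # an empty `-` item in YAML parses as None
            continue
        merged.update(resolve_metadata(parent, definitions))
    # The product's own definitions override anything inherited.
    merged.update(spec.get("metadata") or {})
    return merged


definitions = {
    "BaseProduct": {
        "parents": [None],
        "metadata": {
            "type": {"default": "BaseProduct"},
            "creator": {"default": "UNKNOWN"},
        },
    },
    "ImageProduct": {  # hypothetical child product
        "parents": ["BaseProduct"],
        "metadata": {
            "type": {"default": "ImageProduct"},  # overrides the parent entry
            "exposure": {"default": 0.0},         # new entry
        },
    },
}

resolved = resolve_metadata("ImageProduct", definitions)
print(sorted(resolved))             # prints: ['creator', 'exposure', 'type']
print(resolved["type"]["default"])  # prints: ImageProduct
```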

BaseProduct

Definition document BaseProduct.yml

name: BaseProduct
description: FDI base class data model
parents:
  -
schema: '1.6'
metadata:
    description:
        id_zh_cn: 描述
        data_type: string
        description: Description of this product
        description_zh_cn: 对本产品的描述。
        default: UNKNOWN
        valid: ''
        typecode: B
    type:
        id_zh_cn: 产品类型
        data_type: string
        description: Product Type identification. Name of class or CARD.
        description_zh_cn: 产品类型。完整Python类名或卡片名。
        default: BaseProduct
        valid: ''
        typecode: B
    level:
        id_zh_cn: 产品xx
        data_type: string
        description: Product level.
        description_zh_cn: 产品xx
        default: ALL
        valid: ''
        typecode: B
    creator:
        id_zh_cn: 本产品生成者
        data_type: string
        description: Generator of this product.
        description_zh_cn: 本产品生成方的标识,例如可以是单位、组织、姓名、软件、或特别算法等。
        default: UNKNOWN
        valid: ''
        typecode: B
    creationDate:
        id_zh_cn: 产品生成时间
        fits_keyword: DATE
        data_type: finetime
        description: Creation date of this product
        description_zh_cn: 本产品生成时间
        default: 0
        valid: ''
        typecode:
    rootCause:
        id_zh_cn: 数据来源
        data_type: string
        description: Reason of this run of pipeline.
        description_zh_cn: 数据来源(此例来自鉴定件热真空罐)
        default: UNKNOWN
        valid: ''
        typecode: B
    version:
        id_zh_cn: 版本
        data_type: string
        description: Version of product
        description_zh_cn: 产品版本
        default: '0.8'
        valid: ''
        typecode: B
    FORMATV:
        id_zh_cn: 格式版本
        data_type: string
        description: Version of product schema and revision
        description_zh_cn: 产品格式版本
        default: '1.6.0.10'
        valid: ''
        typecode: B
datasets:

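The preamble key-value pairs provide information about this definition:

name

Name of this product

description

Information about the product

parents

Parent products whose metadata this product inherits

level

Applicable level

schema

Format version of this YAML document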

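From the moment it is created, every product must carry the following metadata entries about itself:

description

(May also be given in a native language if not English.)

type

In the software or business domain

version

Products of the same format must be version controlled and configuration controlled, and must be ready to handle version differences among inputs, algorithms, software, and pipelines.

FORMATV

Version of this document together with schema information, e.g. 1.4.1.2

creator, rootCause, creationDate

Who, why, when, where

The parameters are shown in the table below.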


╒══════════════╤════════════════════╤══════╤══════════╤═══════╤════════════════════╤══════╤═══════════════════════════╕
│ name         │ value              │ unit │ type     │ valid │ default            │ code │ description               │
╞══════════════╪════════════════════╪══════╪══════════╪═══════╪════════════════════╪══════╪═══════════════════════════╡
│ description  │ UNKNOWN            │      │ string   │ None  │ UNKNOWN            │ B    │ Description of this produ │
│              │                    │      │          │       │                    │      │ ct                        │
├──────────────┼────────────────────┼──────┼──────────┼───────┼────────────────────┼──────┼───────────────────────────┤
│ type         │ BaseProduct        │      │ string   │ None  │ BaseProduct        │ B    │ Product Type identificati │
│              │                    │      │          │       │                    │      │ on. Name of class or CARD │
│              │                    │      │          │       │                    │      │ .                         │
├──────────────┼────────────────────┼──────┼──────────┼───────┼────────────────────┼──────┼───────────────────────────┤
│ level        │ ALL                │      │ string   │ None  │ ALL                │ B    │ Product level.            │
├──────────────┼────────────────────┼──────┼──────────┼───────┼────────────────────┼──────┼───────────────────────────┤
│ creator      │ UNKNOWN            │      │ string   │ None  │ UNKNOWN            │ B    │ Generator of this product │
│              │                    │      │          │       │                    │      │ .                         │
├──────────────┼────────────────────┼──────┼──────────┼───────┼────────────────────┼──────┼───────────────────────────┤
│ creationDate │ 1958-01-01T00:00:0 │      │ finetime │ None  │ 1958-01-01T00:00:0 │ Q    │ Creation date of this pro │
│              │ 0.000000           │      │          │       │ 0.000000           │      │ duct                      │
│              │ 0                  │      │          │       │ 0                  │      │                           │
├──────────────┼────────────────────┼──────┼──────────┼───────┼────────────────────┼──────┼───────────────────────────┤
│ rootCause    │ UNKNOWN            │      │ string   │ None  │ UNKNOWN            │ B    │ Reason of this run of pip │
│              │                    │      │          │       │                    │      │ eline.                    │
├──────────────┼────────────────────┼──────┼──────────┼───────┼────────────────────┼──────┼───────────────────────────┤
│ version      │ 0.8                │      │ string   │ None  │ 0.8                │ B    │ Version of product        │
├──────────────┼────────────────────┼──────┼──────────┼───────┼────────────────────┼──────┼───────────────────────────┤
│ FORMATV      │ 1.6.0.10           │      │ string   │ None  │ 1.6.0.10           │ B    │ Version of product schema │
│              │                    │      │          │       │                    │      │  and revision             │
├──────────────┼────────────────────┼──────┼──────────┼───────┼────────────────────┼──────┼───────────────────────────┤
│ listeners    │ <No listener>      │      │          │       │                    │      │                           │
╘══════════════╧════════════════════╧══════╧══════════╧═══════╧════════════════════╧══════╧═══════════════════════════╛
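
A quick way to check that a metadata mapping carries every entry required above is sketched here with a plain dictionary. The helper `missing_entries` is hypothetical and not part of fdi; the default values are copied from BaseProduct.yml.

```python
# Entries every product must carry, in the order listed in the text above.
REQUIRED_KEYS = ("description", "type", "version", "FORMATV",
                 "creator", "rootCause", "creationDate")


def missing_entries(meta):
    """Return the required keys absent from a metadata mapping, in listed order."""
    return [key for key in REQUIRED_KEYS if key not in meta]


# Defaults as given in BaseProduct.yml.
defaults = {
    "description": "UNKNOWN",
    "type": "BaseProduct",
    "creator": "UNKNOWN",
    "creationDate": 0,
    "rootCause": "UNKNOWN",
    "version": "0.8",
    "FORMATV": "1.6.0.10",
}
incomplete = {k: v for k, v in defaults.items() if k not in ("creator", "FORMATV")}

print(missing_entries(defaults))    # prints: []
print(missing_entries(incomplete))  # prints: ['FORMATV', 'creator']
```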

Example (from the Quick Start page):


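>>> # Imports used by this excerpt (module paths for Product and Parameter are assumed).
... from fdi.dataset.product import Product
... from fdi.dataset.arraydataset import ArrayDataset
... from fdi.dataset.tabledataset import TableDataset
... from fdi.dataset.metadata import Parameter
>>> # Creation: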
... x = Product(description="product example with several datasets",
...             instrument="Crystal-Ball", modelName="Mk II")
... x.meta['description'].value  # == "product example with several datasets"
'product example with several datasets'
>>> # The 'instrument' and 'modelName' built-in properties show the
... # origin of FDI -- processing data from scientific instruments.
... x.instrument  # == "Crystal-Ball"
'Crystal-Ball'
>>> # ways to add datasets
... i0 = 6
... i1 = [[1, 2, 3], [4, 5, i0], [7, 8, 9]]
... i2 = 'ev'                 # unit
... i3 = 'image1'     # description
... image = ArrayDataset(data=i1, unit=i2, description=i3)
... # put the dataset into the product
... x["RawImage"] = image
... # take the data out of the product
... x["RawImage"].data  # == [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> # Another syntax to put a dataset into a product: set(name, dataset)
... # Different syntax, same effect as above.
... # Here no unit or description is given when making the ArrayDataset
... x.set('QualityImage', ArrayDataset(
...     [[0.1, 0.5, 0.7], [4e3, 6e7, 8], [-2, 0, 3.1]]))
... x["QualityImage"].unit  # is None
>>> # add another tabledataset
... s1 = [('col1', [1, 4.4, 5.4E3], 'eV'),
...       ('col2', [0, 43.2, 2E3], 'cnt')]
... x["Spectrum"] = TableDataset(data=s1)
... # See the number and types of existing datasets in the product
... [type(d) for d in x.values()]
[fdi.dataset.arraydataset.ArrayDataset,
 fdi.dataset.arraydataset.ArrayDataset,
 fdi.dataset.tabledataset.TableDataset]
>>> # mandatory properties are also in metadata
... # test mandatory BaseProduct properties that are also metadata
... a0 = "Me, myself and I"
... x.creator = a0
... x.creator   # == a0
'Me, myself and I'
>>> # metadata by the same name is also set
... x.meta["creator"].value   # == a0
'Me, myself and I'
>>> # change the metadata
... a1 = "or else"
... x.meta["creator"] = Parameter(a1)
... # metadata changed
... x.meta["creator"].value   # == a1
'or else'
>>> # so was the property
... x.creator   # == a1
'or else'
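>>> # load some metadata
... # NOTE: `v` is not defined in this excerpt; it is presumably set up
... # earlier on the Quick Start page.
... m = x.meta
... m['ddetector'] = v['d']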
>>> print(x.toString())
=== Product (product example with several datasets) ===
meta= {
============  ====================  ======  ========  ====================  =================  ======  =====================
name          value                 unit    type      valid                 default            code    description
============  ====================  ======  ========  ====================  =================  ======  =====================
description   product example with          string    None                  UNKNOWN            B       Description of this p
               several datasets                                                                        roduct
type          Product                       string    None                  Product            B       Product Type identifi
                                                                                                       cation. Name of class
                                                                                                        or CARD.
level         ALL                           string    None                  ALL                B       Product level.
creator       or else                       string    None                  None                       UNKNOWN
creationDate  1958-01-01T00:00:00.          finetime  None                  1958-01-01T00:00:  Q       Creation date of this
              000000                                                        00.000000                   product
              0                                                             0
rootCause     UNKNOWN                       string    None                  UNKNOWN            B       Reason of this run of
                                                                                                        pipeline.
version       0.8                           string    None                  0.8                B       Version of product
FORMATV       1.6.0.10                      string    None                  1.6.0.10           B       Version of product sc
                                                                                                       hema and revision
startDate     1958-01-01T00:00:00.          finetime  None                  1958-01-01T00:00:  Q       Nominal start time  o
              000000                                                        00.000000                  f this product.
              0                                                             0
endDate       1958-01-01T00:00:00.          finetime  None                  1958-01-01T00:00:  Q       Nominal end time  of
              000000                                                        00.000000                  this product.
              0                                                             0
instrument    Crystal-Ball                  string    None                  UNKNOWN            B       Instrument that gener
                                                                                                       ated data of this pro
                                                                                                       duct
modelName     Mk II                         string    None                  UNKNOWN            B       Model name of the ins
                                                                                                       trument of this produ
                                                                                                       ct
mission       _AGS                          string    None                  _AGS               B       Name of the mission.
ddetector     port_1 (0b01)         None    integer   11000000 0b01: port_  None               None    valid rules described
              stand_by (0b0)                          1                                                 with binary masks
              normal (0b1)                            11000000 0b10: port_
              Invalid                                 2
                                                      11000000 0b11: port
                                                      closed
                                                      00100000 0b0: stand_
                                                      by
                                                      00100000 0b1: main
                                                      00010000 0b0: error
                                                      00010000 0b1: normal
                                                      00001111 0b0000: res
                                                      erved
============  ====================  ======  ========  ====================  =================  ======  =====================
MetaData-listeners = ListnerSet{}},
history= {},
listeners= {ListnerSet{}}

=== History (UNKNOWN) ===
PARAM_HISTORY= {''},
TASK_HISTORY= {''},
meta= {(No Parameter.) MetaData-listeners = ListnerSet{}}

History-datasets =
<ODict >

Product-datasets =
<ODict "RawImage":
=== ArrayDataset (image1) ===
meta= {
===========  =======  ======  ======  =======  =========  ======  =====================
name         value    unit    type    valid    default    code    description
===========  =======  ======  ======  =======  =========  ======  =====================
shape        (3, 3)           tuple   None     ()                 Number of elements in
                                                                   each dimension. Quic
                                                                  k changers to the rig
                                                                  ht.
description  image1           string  None     UNKNOWN    B       Description of this d
                                                                  ataset
unit         ev               string  None     None       B       Unit of every element
                                                                  .
typecode     UNKNOWN          string  None     UNKNOWN    B       Python internal stora
                                                                  ge code.
version      0.1              string  None     0.1        B       Version of dataset
FORMATV      1.6.0.1          string  None     1.6.0.1    B       Version of dataset sc
                                                                  hema and revision
===========  =======  ======  ======  =======  =========  ======  =====================
MetaData-listeners = ListnerSet{}}
ArrayDataset-dataset =
1  2  3
4  5  6
7  8  9


"QualityImage":
=== ArrayDataset (UNKNOWN) ===
meta= {
===========  =======  ======  ======  =======  =========  ======  =====================
name         value    unit    type    valid    default    code    description
===========  =======  ======  ======  =======  =========  ======  =====================
shape        (3, 3)           tuple   None     ()                 Number of elements in
                                                                   each dimension. Quic
                                                                  k changers to the rig
                                                                  ht.
description  UNKNOWN          string  None     UNKNOWN    B       Description of this d
                                                                  ataset
unit         None             string  None     None       B       Unit of every element
                                                                  .
typecode     UNKNOWN          string  None     UNKNOWN    B       Python internal stora
                                                                  ge code.
version      0.1              string  None     0.1        B       Version of dataset
FORMATV      1.6.0.1          string  None     1.6.0.1    B       Version of dataset sc
                                                                  hema and revision
===========  =======  ======  ======  =======  =========  ======  =====================
MetaData-listeners = ListnerSet{}}
ArrayDataset-dataset =
   0.1  0.5    0.7
4000    6e+07  8
  -2    0      3.1


"Spectrum":
=== TableDataset (UNKNOWN) ===
meta= {
===========  =======  ======  ======  =======  =========  ======  =====================
name         value    unit    type    valid    default    code    description
===========  =======  ======  ======  =======  =========  ======  =====================
description  UNKNOWN          string  None     UNKNOWN    B       Description of this d
                                                                  ataset
version      0.1              string  None     0.1        B       Version of dataset
FORMATV      1.6.0.1          string  None     1.6.0.1    B       Version of dataset sc
                                                                  hema and revision
===========  =======  ======  ======  =======  =========  ======  =====================
MetaData-listeners = ListnerSet{}}
TableDataset-dataset =
  col1     col2
  (eV)    (cnt)
------  -------
   1        0
   4.4     43.2
5400     2000
>>>