A Day in Suzhou (2024.11.23) - Study Notes ⑦: Data Quality

Data Governance, Data Integrity, and Data Quality: What’s the Connection?
Published on: November 7, 2024
Chris Burgess
Thanks to GxP视界 for sharing this article, and to Chris Burgess for the exposition.
Abstract
Nomenclature is important. Data governance, data integrity, and data quality are all widely used terms, but what do they actually mean and how are they connected? The purpose of this article is to provide a structured model for these terms with their definitions and their relationships in the context of analysis and testing within a pharmaceutical quality system.
The crucial concept is that data quality, including data integrity, is only attainable via data governance, as will be illustrated in the proposed model.
Introduction
In a regulatory context, the establishment of measured values and subsequent reportable results of a predefined quality is an essential activity. Reportable results are then compared with predetermined acceptance criteria and/or standards and specifications. The processes, mechanisms, and control systems necessary to establish measurement values and reportable results of a defined quality are interlinked. These interlinkages may include metrological, procedural, or organizational elements.
The overall proposed model is built up using building blocks akin to the construction of Lego brick models. This approach has been used in previous papers concerning error budgets in measurement uncertainty and Monte Carlo simulation, and data quality within a lifecycle approach (1–2).
An analytical testing flow diagram, shown in Figure 1, gives a high-level idea of the traceability and subsequent interactions from the marketing authorization (new drug application, NDA, or abbreviated new drug application, ANDA) to the reportable result and data quality.

Figure 1: A high-level process flow indicating some of the elements of data governance, data quality, and data integrity, as well as the quality management system (QMS).

This process flow has been translated into the Lego brick model shown in Figure 2. This model is based upon a data quality within a lifecycle approach model, which has been modified and extended, as will be described in some detail (2).

Figure 2: Data governance model modified and extended from the data quality within a lifecycle approach model (taken from reference 2, Figure 15).
Translator's note: the label on the analytical procedure brick in the Chinese version of the figure should correspond to "Validation & verification".

The crucial concept is that data quality is only attainable via data governance.

The data governance and data quality Lego brick model

Data quality is not an accident but a product of design. It is a combination of data integrity and the functionality and control under the Pharmaceutical Quality System, usually termed the quality management system (QMS). These aspects are underpinned by data governance and qualified information technology (IT) infrastructure services. This is essentially a sandwich structure in which the “filling” is provided by metrological integrity, analytical procedure integrity, and quality oversight. These “fillings” are described below.

All the elements shown in Figure 2 are subject to risk assessments and risk management (3), which will not be discussed here.
The qualified IT infrastructure services with cybersecurity and access control, as seen above, also will not be covered further in this article.
It is, however, necessary to discuss the key elements of data integrity and the Pharmaceutical Quality System which, when combined, generate data quality on the foundation of data governance.
Data governance is the totality of arrangements to ensure that data, irrespective of the format in which they are generated, are recorded, processed, retained, and used to ensure a complete, consistent, and accurate record throughout the data lifecycle (4).

Data integrity
The purpose of an analytical procedure is to provide a reportable result of the analytical characteristic or quality attribute being determined. Analysis and testing require a measurement system, and a procedure for its application to a sample.
Data integrity is underpinned by the first brick, the metrological integrity of the instrument or system’s operational performance, with demonstrable assurance that it is “fit for intended use” within a specific analytical procedure, which is “fit for intended purpose” over the data lifecycle.

Metrological integrity
Analysis and testing usually involve the use of an apparatus, analytical instrument, or system to make a measurement. Therefore, establishment of “fitness for intended use” for any apparatus, analytical instruments, or systems used in analysis and testing is necessary to ensure metrological integrity over the operational ranges required.
It is therefore essential to establish “fitness for intended use” before the analytical procedure is performed. The main resources for instrument and system requirements are the specific monographs and general chapters in the pharmacopeias. In particular, the United States Pharmacopeia (USP) has a unique general chapter on the lifecycle processes and requirements for ensuring that any apparatus, analytical instrument, or system is “fit for intended use,” as seen in Figure 3 (5).

Figure 3: Data quality outline for an analysis and testing quality control (QC) model using USP references.

Assurance lifecycle activities include:
• analytical instrument and system qualification
• application software validation
• calibration over the operational ranges of critical measurement functions
• maintenance and change control
• trend analysis to monitor an ongoing state of control.
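The last item in the list, trend analysis, can be made concrete with a small sketch. The Python below is an illustration only, not a method taken from the article or from USP. It assumes a hypothetical series of periodic check-weight readings, derives mean ± 3 standard deviation limits from a baseline period, and flags later readings that fall outside those limits. The function names, the 3-sigma rule, and the data are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (lower, centre, upper) limits derived from baseline readings."""
    centre = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation of the baseline
    return centre - k * spread, centre, centre + k * spread


def out_of_trend(readings, limits):
    """Return (index, value) pairs for readings outside the limits."""
    lower, _, upper = limits
    return [(i, x) for i, x in enumerate(readings) if x < lower or x > upper]


# Hypothetical balance check-weight readings in grams (illustrative data only).
baseline = [100.01, 99.99, 100.00, 100.02, 99.98, 100.00, 100.01, 99.99]
later = [100.00, 100.01, 100.09, 99.99]

limits = control_limits(baseline)
flagged = out_of_trend(later, limits)
print(flagged)  # the third later reading (100.09) lies above the upper limit
```

In practice the limits, the baseline period, and the response to a flagged reading would be defined in the laboratory's own procedures; the sketch only shows the shape of the calculation.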
The second component of data integrity is a validated or verified analytical procedure performed by a trained analyst.

Analytical procedure integrity
It has been a requirement for more than 20 years that analytical methods and procedures be validated or verified (6–7). Recently, these requirements have been updated by the International Council for Harmonisation (ICH), and a new guideline on analytical procedure development has been issued (8–9). USP General Chapter <1220> on Analytical Procedure Lifecycle should be consulted (10).
Critical lifecycle activities include:
• analytical target profile and development
• qualification and verification
• sample management and preparation
• use of reference standards
• trained analysts and second person review
• ongoing performance verification
• deviation management and change control.
Particular attention should be paid to analyst training and second person review within the laboratory.
Second person review is essential in ensuring data quality.

The Pharmaceutical Quality System
Quality control is the guardian of scientific soundness, whereas the quality assurance function is the guardian of compliance. To perform this duty of care, the quality assurance function requires a robust and comprehensive quality management system that enshrines the elements to provide and perform the necessary quality oversight over the data lifecycle.
Quality oversight involves not only reviewing and auditing QC activities but also the implementation of the Pharmaceutical Quality System itself, to ensure that it is up to date (11).
Quality oversight covers key areas such as:
• policies
• procedures
• good documentation practice (GDocP)
• training plans and records
• data integrity audits and investigations
• records management and archiving
• second person review (12).

ALCOA models for data integrity
Much has been written on this topic, particularly regarding the three ALCOA models and the meanings of their acronyms (4). These acronyms are summarized below and illustrated in Figure 4.

Figure 4: Pictorial representation of the three ALCOA models.

ALCOA (13)
Attributable
It must be possible to identify the individual or computerized system that performed a recorded task, and when the task was performed. This also applies to any changes made to records, such as corrections, deletions, and other modifications, where it is important to know who made a change, when, and why.
Legible
All data, including any associated metadata, should be unambiguously readable throughout the lifecycle. Legibility also extends to any changes or modifications to the original data made by an authorized individual, so that the original entry is not obscured.
Contemporaneous
Data should be recorded on paper or electronically at the time the observation is made. All data entries must be dated and signed by the person entering the data.
Original
The original record is the first capture of information, whether recorded on paper (static) or electronically (usually dynamic, depending on the complexity of the system). Data or information originally captured in a dynamic state remain in that state.
Accurate
Records need to be a truthful representation of facts to be accurate. There should be no errors in the original observation(s), and no editing is allowed without documented amendments or audit trail entries by authorized personnel. Accuracy is assured and verified by a documented review, including review of audit trails.
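One way to picture how the ALCOA attributes fit together is an append-only record in which every write carries a user, a timestamp, and a reason, and in which a correction never overwrites the first entry. The Python below is a minimal sketch under those assumptions. The class names, field names, and sample values are hypothetical and do not represent any particular laboratory system or regulatory requirement.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class AuditEntry:
    """One immutable audit trail entry: who, when, what changed, and why."""
    user: str
    timestamp: datetime
    old_value: Optional[str]
    new_value: str
    reason: str


@dataclass
class Record:
    """A value with an append-only history, so the original is never obscured."""
    name: str
    trail: List[AuditEntry] = field(default_factory=list)

    @property
    def original(self) -> Optional[str]:
        return self.trail[0].new_value if self.trail else None

    @property
    def current(self) -> Optional[str]:
        return self.trail[-1].new_value if self.trail else None

    def write(self, user: str, value: str, reason: str) -> AuditEntry:
        # Attributable: refuse a change with no identified user or no reason.
        if not user or not reason:
            raise ValueError("a change must be attributable and justified")
        # Contemporaneous: the timestamp is taken at the moment of writing.
        entry = AuditEntry(user, datetime.now(timezone.utc),
                           self.current, value, reason)
        self.trail.append(entry)
        return entry


# Hypothetical usage: an entry followed by a documented correction.
rec = Record("sample_weight_mg")
rec.write("analyst_a", "25.13", "initial entry")
rec.write("analyst_a", "25.31", "transcription error corrected")
print(rec.original, rec.current, len(rec.trail))
```

A reviewer can then read the whole trail rather than only the current value, which is the point of including audit trail review in the documented review.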

ALCOA+ (14)
Complete
The record includes all data from an analysis: original data and any data generated before and after repeat testing, reanalysis, modification, recalculation, reintegration, and deletion. For hybrid systems, the paper output must be linked to the underlying electronic records used to produce it. A complete record of data generated electronically includes relevant metadata.
Consistent
Data and information records should be created, processed, and stored in a logical manner that has a defined consistency. This includes policies or procedures that help control or standardize data (such as chronological sequencing, date formats, units of measurement, approaches to rounding, significant digits, etc.).
Enduring
Data are recorded in a permanent, maintainable, authorized media form during the retention period. Records should be kept in a manner such that they continue to exist and are accessible for the entire period during which they are needed. They need to remain intact as an indelible and durable record throughout the record retention period.
Available
Records should be available for review at any time during the required retention period, accessible in a readable format to all applicable personnel who are responsible for their review, whether for routine release decisions, investigations, trending, annual reports, audits, or inspections.
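The "Consistent" attribute mentions standardizing date formats and approaches to rounding. The Python sketch below shows what such a convention could look like in code. Both choices here are assumptions made for illustration and are not prescribed by the article: results are rounded with a round-half-up rule using decimal arithmetic, and timestamps are written as ISO 8601 strings in UTC. The function names are hypothetical.

```python
from datetime import datetime, timezone
from decimal import Decimal, ROUND_HALF_UP


def round_result(value: str, places: int) -> Decimal:
    """Round a reported value to a fixed number of decimals, halves rounded up."""
    quantum = Decimal(1).scaleb(-places)  # places=2 gives Decimal('0.01')
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


def iso_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 string in UTC."""
    if moment.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return moment.astimezone(timezone.utc).isoformat(timespec="seconds")


# Hypothetical values for illustration.
print(round_result("98.765", 2))
print(iso_timestamp(datetime(2024, 11, 7, 9, 30, tzinfo=timezone.utc)))
```

Values are passed as strings so that the decimal digits are taken exactly as recorded rather than through binary floating point. Whichever rule a laboratory adopts, the point is that it is written down once and applied the same way everywhere.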

ALCOA++ (4,15)
Traceable
Data should be traceable through the lifecycle. Any changes to data or metadata should be explained and should be traceable without obscuring the original information. Timestamps should be traceable to a trusted time source. Metrological standards and instrument or system qualification should be traceable to international standards wherever possible.

Data quality
Data quality is a combination of data integrity and overall control as part of the pharmaceutical quality system.
An example of a quality control data quality outline for analysis and testing, using examples from USP, is illustrated in Figure 3.

Data quality cannot be assured without a data governance structure supported by qualified IT infrastructure services with cybersecurity and access control.
The proposed Lego brick model provides a structural framework for assuring data quality over the lifecycle.
A short glossary of definitions of key terms is appended.

Acknowledgements
I wish to thank Bob McDowall and Oscar Quatrocchi for their review and helpful comments.
Definitions of key terms
Data governance
The totality of arrangements to ensure that data, irrespective of the format in which they are generated, are recorded, processed, retained, and used to ensure a complete, consistent, and accurate record throughout the data lifecycle (16).
Data integrity
Data integrity is the degree to which data are complete, consistent, accurate, trustworthy, reliable, and that these characteristics of the data are maintained throughout the data lifecycle. The data should be collected and maintained in a secure manner, so that they are attributable, legible, contemporaneously recorded, original (or a true copy), and accurate. Assuring data integrity requires appropriate quality and risk management systems, including adherence to sound scientific principles and good documentation practices (16).
Data lifecycle
All phases in the life of the data (including raw data), from initial generation and recording through processing (including transformation or migration), use, data retention, archive and retrieval, and destruction (17).
Data quality
The assurance that data produced is exactly what was intended to be produced and fit for its intended purpose (17).
Good documentation practices (GDocP)
Those measures that collectively and individually ensure documentation, whether paper or electronic, meet data management and integrity principles, for instance, ALCOA+ (17).
Metadata
Data that describe the attributes of other data, and provide context and meaning. Typically, these are data that describe the structure, data elements, inter-relationships and other characteristics of data, such as audit trails. Metadata also permit data to be attributable to an individual (or if automatically generated, to the original data source). Metadata form an integral part of the original record. Without the context provided by metadata, the data has no meaning (17).
Pharmaceutical Quality System
A model for an effective quality management system for the pharmaceutical industry, used to direct and control a pharmaceutical company with regard to quality (ICH Q10, based upon ISO 9000:2005) (11).
Quality unit(s)
Quality units are organizational entities within the pharmaceutical quality system, necessarily independent of each other and production, that fulfill quality control and quality assurance roles and responsibilities.
Raw data
Raw data is defined as the original record (data) which can be described as the first capture of information, whether recorded on paper or electronically. Information that is originally captured in a dynamic state should remain available in that state (16).
However, US regulations for good laboratory practice offer a better definition (18).
Raw data means any laboratory worksheets, records, memoranda, notes, or exact copies thereof, that are the result of original observations and activities of a nonclinical laboratory study, and are necessary for the reconstruction and evaluation of the report of that study.

References
1. Burgess, C. Never Mind the Statistics; Just Tell Me What the Answer Is! PharmTech.com, March 20, 2023.
2. ECA, Guide for Integrated Lifecycle Approach to Analytical Instrument Qualification and System Validation, Version 1 (Analytical Quality Control Group, November 2023).
3. ICH, Q9 Quality Risk Management, Step 5 Version – Revision 1 (2023).
4. McDowall, R. D. Is Traceability the Glue for ALCOA, ALCOA+, or ALCOA++? Spectroscopy 2022, 37 (4) 13–19. DOI: 10.56530/spectroscopy.up8185n1
5. USP. USP General Chapter <1058>, “Analytical Instrument Qualification,” USP-NF (Rockville, Md., 2024). DOI: 10.31003/USPNF_M1124_01_01
6. USP. USP General Chapter <1225>, “Validation of Compendial Procedures,” USP-NF (Rockville, Md., 2024). DOI: 10.31003/USPNF_M99945_04_01
7. USP. USP General Chapter <1226>, “Verification of Compendial Procedures,” USP-NF (Rockville, Md., 2024). DOI: 10.31003/USPNF_M870_03_01
8. ICH, Q2(R2) Validation of Analytical Procedures, Step 5 Version – Revision 1 (2024).
9. ICH, Q14 Analytical Procedure Development, Step 5 Version (2024).
10. USP. USP General Chapter <1220>, “Analytical Procedure Lifecycle,” USP-NF (Rockville, Md., 2022).
11. ICH, Q10 Pharmaceutical Quality System, Step 5 Version (2008).
12. Newton, M. E., and McDowall, R. D. Data Integrity in the Chromatography Laboratory, Part V: Second-Person Review. LCGC North Am. 2018, 36 (8) 527–529.
13. Woollen, S. W., “Data Quality and the Origin of ALCOA,” The Compass Newsletter, Summer 2010.
14. EMA, EMA/INS/GCP/454280/2010, Reflection Paper on Expectations for Electronic Source Data and Data Transcribed to Electronic Data Collection Tools in Clinical Trials (June 9, 2010).
15. EMA, EMA/INS/GCP/112288/2023, Guideline on Computerised Systems and Electronic Data in Clinical Trials (March 9, 2023).
16. MHRA, ‘GXP’ Data Integrity Guidance and Definitions, Revision 1 (March 2018).
17. PIC/S, Good Practices for Data Management and Integrity in Regulated GMP/GDP Environments (July 2021).
18. CFR Title 21, Part 58 (Government Printing Office, Washington, DC) 58367–58380.

