Глоссариум по искусственному интеллекту: 2500 терминов. Том 2


Counterfactual fairness is a fairness metric that checks whether a classifier produces the same result for one individual as it does for another individual who is identical to the first, except with respect to one or more sensitive attributes. Evaluating a classifier for counterfactual fairness is one method for surfacing potential sources of bias in a model. See «When Worlds Collide: Integrating Different Counterfactual Assumptions in Fairness» for a more detailed discussion of counterfactual fairness[310 - Counterfactual fairness [Электронный ресурс] https://developers.google.com URL: https://developers.google.com/machine-learning/glossary#counterfactual-fairness (дата обращения: 04.05.2023)].

Coverage bias – a form of selection bias in which the study sample is not representative of the population: some members of the population have zero chance of being included in the sample[311 - Coverage bias [Электронный ресурс] https://developers.google.com URL: https://developers.google.com/machine-learning/glossary#coverage-bias (дата обращения: 30.06.2023)].

Crash blossom is a sentence or phrase with an ambiguous meaning. Crash blossoms present a significant problem in natural language understanding. For example, the headline Red Tape Holds Up Skyscraper is a crash blossom because an NLU model could interpret the headline literally or figuratively[312 - Crash blossom [Электронный ресурс] https://developers.google.com URL: https://developers.google.com/machine-learning/glossary#crash-blossom (дата обращения: 09.04.2023)].

Critic – synonym for Deep Q-Network[313 - Critic [Электронный ресурс] https://developers.google.com URL: https://developers.google.com/machine-learning/glossary#critic (дата обращения: 04.05.2023)].

Critical information infrastructure – objects of critical information infrastructure, as well as telecommunication networks used to organize the interaction of such objects[314 - Критическая информационная инфраструктура [Электронный ресурс] https://it-enigma.ru URL: https://it-enigma.ru/about/news/chto-takoe-kriticheskaya-informaczionnaya-infrastruktura-(kii)) (дата обращения: 12.07.2023)].

Critical information infrastructure of the Russian Federation is a set of critical information infrastructure objects, as well as telecommunication networks used to organize the interaction of critical information infrastructure objects with each other[315 - Критическая информационная инфраструктура РФ [Электронный ресурс] http://government.ru URL: http://government.ru/docs/all/112572/ ФЗ №187 от 26.07.2017 «О безопасности критической информационной инфраструктуры РФ» (дата обращения: 04.05.2023)].

Cross-entropy is a generalization of Log Loss to multi-class classification problems. Cross-entropy quantifies the difference between two probability distributions. See also perplexity[316 - Cross-entropy [Электронный ресурс] https://helenkapatsa.ru URL: https://www.helenkapatsa.ru/kross-entropiia/ (дата обращения: 16.02.2022)].
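As a minimal illustration of the definition above, cross-entropy between a true distribution p and a predicted distribution q can be computed directly from its formula, H(p, q) = −Σ p·log q. The function name and the example values below are chosen for illustration only:

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """Cross-entropy H(p, q) = -sum(p * log q) between a true
    distribution p and a predicted distribution q; eps guards log(0)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return -np.sum(p * np.log(q + eps))

# One-hot true label (class 1) vs. a model's predicted probabilities:
# for a one-hot target this reduces to Log Loss, -log(q[true class]).
p_true = [0.0, 1.0, 0.0]
q_pred = [0.1, 0.7, 0.2]
loss = cross_entropy(p_true, q_pred)  # -log(0.7) ≈ 0.357
```

With a one-hot target the sum collapses to a single term, which is exactly the binary Log Loss case the entry says cross-entropy generalizes.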

Crossover (also recombination) in genetic algorithms and evolutionary computation, a genetic operator used to combine the genetic information of two parents to generate new offspring. It is one way to stochastically generate new solutions from an existing population, and analogous to the crossover that happens during sexual reproduction in biological organisms. Solutions can also be generated by cloning an existing solution, which is analogous to asexual reproduction. Newly generated solutions are typically mutated before being added to the population[317 - Crossover [Электронный ресурс] https://brainly.in URL: https://brainly.in/question/5802477 (дата обращения 28.02.2022)].
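The recombination-then-mutation flow described above can be sketched with a classic single-point crossover on bit-string genomes. This is one common variant among many (uniform and two-point crossover are others), not a definitive implementation:

```python
import random

def single_point_crossover(parent_a, parent_b, rng=random):
    """Cut both parent genomes at one random point and swap the tails,
    producing two offspring that mix genetic information from each parent."""
    assert len(parent_a) == len(parent_b)
    point = rng.randrange(1, len(parent_a))  # cut strictly inside the genome
    child1 = parent_a[:point] + parent_b[point:]
    child2 = parent_b[:point] + parent_a[point:]
    return child1, child2

def mutate(genome, rate=0.1, rng=random):
    """Flip each bit with small probability, as typically done after crossover."""
    return [1 - g if rng.random() < rate else g for g in genome]

a = [0, 0, 0, 0, 0, 0]
b = [1, 1, 1, 1, 1, 1]
c1, c2 = single_point_crossover(a, b)
offspring = [mutate(c1), mutate(c2)]  # newly generated solutions join the population
```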

Cross-Validation (k-fold Cross-Validation, Leave-p-out Cross-Validation) is a collection of processes designed to evaluate how the results of a predictive model will generalize to new, previously unseen data sets[318 - Перекрёстная проверка [Электронный ресурс] https://ru.wikipedia.org URL: https://ru.wikipedia.org/wiki/Перекрёстная_проверка (дата обращения: 20.06.2023)].
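The k-fold variant mentioned above partitions the data into k folds, each serving once as the held-out test set. A minimal index-splitting sketch (libraries such as scikit-learn provide production versions of this):

```python
def k_fold_splits(n_samples, k):
    """Return k (train_idx, test_idx) pairs: each sample appears in
    exactly one test fold, and in the training set of the other k-1 folds."""
    # Distribute any remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    splits, start = [], 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        splits.append((train_idx, test_idx))
        start += size
    return splits

splits = k_fold_splits(10, 5)  # 5 folds of 2 held-out samples each
```

A model would be trained k times, once per split, and the k test scores averaged to estimate generalization.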

Cryogenic freezing (cryonics, human cryopreservation) is a technology of preserving in a state of deep cooling (using liquid nitrogen) the head or body of a person after his death with the intention to revive them in the future[319 - Криогенная заморозка [Электронный ресурс] https://ru.wikipedia.org URL: https://ru.wikipedia.org/wiki/Крионика (дата обращения: 04.05.2023)].

Cyber-physical systems are intelligent networked systems with built-in sensors, processors and actuators that are designed to interact with the physical environment and support the operation of computer information systems in real time[320 - Киберфизические системы [Электронный ресурс] https://ulgov.ru URL: https://ulgov.ru/page/index/permlink/id/14949/ (дата обращения: 02.05.2023)].

«D»

Darkforest is a computer Go program based on deep learning techniques using a convolutional neural network. Its updated version, Darkforest2, combines the techniques of its predecessor with Monte Carlo tree search (MCTS). MCTS effectively takes the tree search methods commonly seen in computer chess programs and randomizes them. With this update, the system is known as Darkforest3[321 - Darkforest [Электронный ресурс] https://en.wikipedia.org URL: https://en.wikipedia.org/wiki/Darkforest (дата обращения: 28.06.2023)].

Dartmouth workshop – the Dartmouth Summer Research Project on Artificial Intelligence was the name of a 1956 summer workshop now considered by many (though not all) to be the seminal event for artificial intelligence as a field[322 - Dartmouth workshop [Электронный ресурс] https://static.hlt.bme.hu URL: https://static.hlt.bme.hu/semantics/external/pages/John_McCarthy/en.wikipedia.org/wiki/Dartmouth_workshop.html (дата обращения: 16.04.2023)].

Data analysis is obtaining an understanding of data by considering samples, measurement, and visualization. Data analysis can be particularly useful when a dataset is first received, before one builds the first model. It is also crucial in understanding experiments and debugging problems with the system[323 - Data analysis [Электронный ресурс] https://dic.academic.ru URL: https://dic.academic.ru/dic.nsf/ruwiki/1727524 (дата обращения: 16.02.2022)].

Data analytics is the science of analyzing raw data to make conclusions about that information. Many of the techniques and processes of data analytics have been automated into mechanical processes and algorithms that work over raw data for human consumption[324 - Data analytics [Электронный ресурс] www.investopedia.com (дата обращения: 07.07.2022) URL: https://www.investopedia.com/terms/d/data-analytics.asp].

Data augmentation in data analysis refers to techniques used to artificially increase the amount and diversity of training data. It helps reduce overfitting when training a machine learning model[325 - Data augmentation [Электронный ресурс] https://ibm.com URL: https://www.ibm.com/docs/ru/oala/1.3.5?topic=SSPFMY_1.3.5/com.ibm.scala.doc/config/iwa_cnf_scldc_scl_dc_ovw.html (дата обращения: 18.02.2022)].
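For image data, common augmentations include flips and small pixel perturbations. A toy sketch of generating extra training examples from one array (the transforms chosen here are illustrative, not a standard pipeline):

```python
import numpy as np

def augment(image, rng):
    """Produce simple augmented variants of a 2-D image array:
    horizontal flip, vertical flip, and additive Gaussian pixel noise."""
    return [
        np.fliplr(image),                                  # mirror left-right
        np.flipud(image),                                  # mirror top-bottom
        image + rng.normal(0.0, 0.05, size=image.shape),   # jitter pixel values
    ]

rng = np.random.default_rng(0)
img = np.arange(12, dtype=float).reshape(3, 4)
augmented = augment(img, rng)  # three new training examples from one original
```

Each variant keeps the original label, so the model sees more varied inputs without new annotation work.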

Data Cleaning is the process of identifying, correcting, or removing inaccurate or corrupt data records[326 - Очистка данных [Электронный ресурс] https://ru.wikipedia.org URL: https://ru.wikipedia.org/wiki/Очистка_данных (дата обращения: 20.06.2023)].
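The identify/correct/remove steps above can be sketched with plain Python. The record schema (`name`, `age`) and the validity rules are hypothetical, chosen only to make the example concrete:

```python
def clean_records(records):
    """Drop records with missing or corrupt fields, normalize the rest,
    and remove duplicates (case- and whitespace-insensitive on name)."""
    seen = set()
    cleaned = []
    for rec in records:
        name = (rec.get("name") or "").strip()
        age = rec.get("age")
        if not name or not isinstance(age, int) or age < 0:
            continue  # inaccurate or corrupt record: removed
        key = (name.lower(), age)
        if key in seen:
            continue  # duplicate record: removed
        seen.add(key)
        cleaned.append({"name": name, "age": age})
    return cleaned

raw = [
    {"name": "Ada", "age": 36},
    {"name": " ada ", "age": 36},   # duplicate after normalization
    {"name": "", "age": 20},        # missing name
    {"name": "Bob", "age": -5},     # corrupt value
]
result = clean_records(raw)  # only the first record survives
```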

Data Curation – includes the processes related to the organization and management of data which is collected from various sources[327 - Data Curation [Электронный ресурс] www.geeksforgeeks.org URL: https://www.geeksforgeeks.org/data-curation-lifecycle/ (дата обращения 22.02.2022)].

Data entry – the process of converting verbal or written responses to electronic form[328 - Data entry [Электронный ресурс] www.umich.edu URL: https://www.icpsr.umich.edu/web/ICPSR/cms/2042#D (дата обращения: 07.07.2022)].

Data fusion — the process of integrating multiple data sources to produce more consistent, accurate, and useful information than that provided by any individual data source[329 - Data fusion [Электронный ресурс] www.researchgate.net URL: https://www.researchgate.net/post/what_is_the_difference_between_Data_integration_and_data_fusion (дата обращения 14.03.2022)].

Data Integration involves combining data residing in different sources and supplying a unified view of it to users. Data integration is in high demand in both commercial and scientific domains, which need to merge data and research results from different repositories[330 - Data Integration [Электронный ресурс] https://ibm.com URL: https://www.ibm.com/ru-ru/analytics/data-integration (дата обращения: 18.02.2022)].

Data is a collection of qualitative and quantitative variables; it contains information that can be represented numerically and analyzed.

Data Lake is a type of data repository that stores data in its natural format and relies on various schemata and structure to index the data[331 - Data Lake [Электронный ресурс] https://bigdataschool.ru URL: https://www.bigdataschool.ru/wiki/data-lake (дата обращения: 17.02.2022)].

Data markup is the stage of processing structured and unstructured data during which data (including text documents, photo and video images) are assigned identifiers that reflect the type of data (data classification), and (or) data is interpreted to solve a specific problem, including using machine learning methods (National Strategy for the Development of Artificial Intelligence for the period up to 2030)[332 - Разметка данных [Электронный ресурс] https://cdto.wiki URL: https://cdto.wiki/Разметка_данных Указ Президента РФ от 10.10.2019 №490 «О развитии искусственного интеллекта в РФ» (дата обращения: 29.06.2023)].

Data Mining is the process of data analysis and information extraction from large datasets using machine learning, statistical approaches, and many other methods[333 - Data Mining [Электронный ресурс] https://bigdataschool.ru URL: https://www.teradata.ru/Glossary/What-is-Data-Mining (дата обращения: 17.02.2022)].

Data parallelism is a way of scaling training or inference that replicates an entire model onto multiple devices and then passes a subset of the input data to each device. Data parallelism can enable training and inference on very large batch sizes; however, data parallelism requires that the model be small enough to fit on all devices. See also model parallelism[334 - Data parallelism [Электронный ресурс] https://developers.google.com URL: https://developers.google.com/machine-learning/glossary#data-parallelism (дата обращения: 20.06.2023)].
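The replicate-model/shard-data idea can be simulated on one machine: each "device" holds a full copy of the parameters, computes a gradient on its shard of the batch, and the per-device gradients are averaged (the role an all-reduce plays in real systems). The linear model and loss below are illustrative assumptions:

```python
import numpy as np

def grad_mse(w, X, y):
    """Gradient of mean squared error for a linear model y ≈ X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def data_parallel_grad(w, X, y, n_devices):
    """Simulated data parallelism: shard the batch, compute a gradient
    per 'device' against the same replicated weights w, then average."""
    X_shards = np.array_split(X, n_devices)
    y_shards = np.array_split(y, n_devices)
    grads = [grad_mse(w, Xs, ys) for Xs, ys in zip(X_shards, y_shards)]
    return np.mean(grads, axis=0)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = X @ np.array([1.0, -2.0, 0.5])
w = np.zeros(3)
g_parallel = data_parallel_grad(w, X, y, n_devices=4)
g_full = grad_mse(w, X, y)  # with equal shards, the two gradients match
```

Note the constraint from the entry: every "device" holds the whole of `w`, which is why the model must fit on each device.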

Data Processing Unit (DPU) is a programmable specialized electronic circuit with hardware accelerated data processing for data-oriented computing[335 - Data Processing Unit (DPU) [Электронный ресурс] https://en.wikipedia.org URL: https://en.wikipedia.org/wiki/Data_processing_unit (дата обращения: 11.07.2023)].

Data protection is the process of safeguarding data; it involves the relationship between the collection and dissemination of data and technology, the public perception and expectation of privacy, and the political and legal underpinnings surrounding that data. It aims to strike a balance between individual privacy rights and the use of data for business purposes[336 - Data protection [Электронный ресурс] www.techopedia.com URL: https://www.techopedia.com/definition/29406/data-protection (дата обращения: 07.07.2022)].

Data Refinement is used to convert an abstract data model (for example, one expressed in terms of sets) into implementable data structures such as arrays[337 - Data Refinement [Электронный ресурс] www.atscale.com URL: https://www.atscale.com/blog/what-is-data-extraction/ (дата обращения 12.01.2022)].

Data Science is a broad grouping of mathematics, statistics, probability, computing, data visualization to extract knowledge from a heterogeneous set of data (images, sound, text, genomic data, social network links, physical measurements, etc.). The methods and tools derived from artificial intelligence are part of this family[338 - Data Science [Электронный ресурс] https://www.coe.int URL: https://www.coe.int/en/web/artificial-intelligence/glossary (дата обращения: 10.05.2023)],[339 - Наука о данных [Электронный ресурс] https://www.tadviser.ru URL: https://www.tadviser.ru/index.php/Статья:Наука_о_данных_(Data_Science) (дата обращения: 10.05.2023)].

Data set is a set of data that has undergone preliminary preparation (processing) in accordance with the requirements of the legislation of the Russian Federation on information, information technology and information protection and is necessary for the development of software based on artificial intelligence (National strategy for the development of artificial intelligence for the period up to 2030)[340 - Набор данных [Электронный ресурс] http://static.kremlin.ru URL: http://static.kremlin.ru/media/events/files/ru/AH4x6HgKWANwVtMOfPDhcbRpvd1HCCsv.pdf I. Общие положения Указа Президента РФ №490 от 10.10.2019 г. «О развитии искусственного интеллекта в РФ» (дата обращения: 10.05.2023)].

Data Streaming Accelerator (DSA) is a device that performs one specific task – in this case, transferring data in less time than the CPU would. What makes DSA special is that it is designed around a capability that Compute Express Link adds on top of PCI Express 5.0: coherent access to RAM for all peripherals connected to a PCI Express port, i.e., they use the same memory addresses.

Data variability describes how far apart data points lie from each other and from the center of a distribution. Along with measures of central tendency, measures of variability give you descriptive statistics that summarize your data[341 - Data variability [Электронный ресурс] www.investopedia.com URL: https://www.investopedia.com/terms/v/variability.asp (дата обращения: 07.07.2022)].

Data veracity is the degree of accuracy or truthfulness of a data set. In the context of big data, it is not just the quality of the data that is important, but also how trustworthy the source, the type, and the processing of the data are[342 - Data veracity [Электронный ресурс] https://datafloq.com URL: https://datafloq.com/read/data-veracity-new-key-big-data/ (дата обращения: 07.07.2022)].

Data Warehouse is typically an offline copy of production databases and copies of files in a non-production environment[343 - Data Warehouse [Электронный ресурс] www.interviewbit.com URL: https://www.interviewbit.com/blog/characteristics-of-data-warehouse/ (дата обращения 14.03.2018)].

Database is a «container» storing data such as numbers, dates or words, which can be reprocessed by computer means to produce information; for example, numbers and names assembled and sorted to form a directory[344 - Database [Электронный ресурс] https://www.coe.int URL: https://www.coe.int/en/web/artificial-intelligence/glossary (дата обращения: 28.03.2023)].

DataFrame is a popular datatype for representing datasets in pandas. A DataFrame is analogous to a table. Each column of the DataFrame has a name (a header), and each row is identified by a number[345 - DataFrame [Электронный ресурс] https://pynative.com URL: https://pynative.com/python-pandas-dataframe/ (дата обращения 22.02.2022)].
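The table analogy above (named columns, numbered rows) is easy to see directly; the column names and values in this sketch are invented for illustration:

```python
import pandas as pd

# Each column has a name (header); rows get an integer index by default.
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Kyoto"],
    "population_m": [0.7, 9.7, 1.5],
})

headers = df.columns.tolist()        # ['city', 'population_m']
row_one_city = df.loc[1, "city"]     # select a cell by row label and column name
large = df[df["population_m"] > 1.0] # filter rows, as in a table query
```

Selection by `df.loc[row, column]` mirrors looking up a cell in a table by its row number and column header.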

Datalog is a declarative logic programming language that syntactically is a subset of Prolog. It is often used as a query language for deductive databases. In recent years, Datalog has found new application in data integration, information extraction, networking, program analysis, security, and cloud computing[346 - Datalog [Электронный ресурс] www.definitions.net URL: https://www.definitions.net/definition/Datalog (дата обращения 30.04.2020)].

Datamining – the discovery, interpretation, and communication of meaningful patterns in data[347 - Datamining [Электронный ресурс] https://bellintegrator.ru URL: https://bellintegrator.ru/ArtificialIntelligence/Data-Mining (дата обращения: 19.02.2022)].

Dataset API (tf.data) is a high-level TensorFlow API for reading data and transforming it into a form that a machine learning algorithm requires. A tf.data.Dataset object represents a sequence of elements, in which each element contains one or more Tensors. A tf.data.Iterator object provides access to the elements of a Dataset. For details about the Dataset API, see Importing Data in the TensorFlow Programmer’s Guide[348 - Dataset API (tf.data) [Электронный ресурс] https://developers.google.com URL: https://developers.google.com/machine-learning/glossary/tensorflow#dataset-api-tf.data (дата обращения: 27.03.2023)].

Debugging is the process of finding and resolving bugs (defects or problems that prevent correct operation) within computer programs, software, or systems. Debugging tactics can involve interactive debugging, control flow analysis, unit testing, integration testing, log file analysis, monitoring at the application or system level, memory dumps, and profiling. Many programming languages and software development tools also offer programs to aid in debugging, known as debuggers[349 - Debugging [Электронный ресурс] https://en.wikipedia.org URL: https://en.wikipedia.org/wiki/Debugging (дата обращения: 07.07.2022)].

Decentralized applications (dApps) are digital applications or programs that exist and run on a blockchain or peer-to-peer (P2P) network of computers instead of a single computer. DApps (also called «dapps») are outside the purview and control of a single authority. DApps – which are often built on the Ethereum platform – can be developed for a variety of purposes including gaming, finance, and social media[350 - Decentralized applications (dApps) [Электронный ресурс] www.investopedia.com URL: https://www.investopedia.com/terms/d/decentralized-applications-dapps.asp (дата обращения: 07.07.2022)].

Decentralized control is a process in which a significant number of control actions related to a given object are generated by the object itself on the basis of self-government[351 - Децентрализованное управление [Электронный ресурс] https://be5.biz URL: https://be5.biz/ekonomika/u001/09.html (дата обращения: 09.04.2023)].

Decision boundary – the separator between classes learned by a model in binary or multi-class classification problems[352 - Decision boundary [Электронный ресурс] https://developers.google.com URL: https://developers.google.com/machine-learning/glossary#decision-boundary (дата обращения: 28.03.2023)].

Decision boundary – in the case of backpropagation-based artificial neural networks or perceptrons, the type of decision boundary the network can learn is determined by its number of hidden layers. With no hidden layers, it can learn only linearly separable problems. With one hidden layer, it can approximate any continuous function on compact subsets of R^n, as shown by the universal approximation theorem, and thus can realize an arbitrary decision boundary.
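The no-hidden-layer case is easy to make concrete: a single-layer perceptron classifies points by which side of the hyperplane w·x + b = 0 they fall on, so its decision boundary is always linear. The weights below are set by hand (boundary x0 + x1 = 1) purely for illustration:

```python
import numpy as np

def predict(w, b, X):
    """Single-layer perceptron: class 1 iff the point lies on the
    positive side of the linear decision boundary w·x + b = 0."""
    return (X @ w + b > 0).astype(int)

w = np.array([1.0, 1.0])   # hand-picked weights: boundary is x0 + x1 = 1
b = -1.0
X = np.array([[0.0, 0.0],  # below the line -> class 0
              [1.0, 1.0],  # above the line -> class 1
              [0.2, 0.3],  # below -> class 0
              [0.9, 0.8]]) # above -> class 1
labels = predict(w, b, X)
```

Learning XOR, by contrast, requires a nonlinear boundary and hence at least one hidden layer.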

Decision intelligence (DI) is a practical discipline used to improve the decision-making process by clearly understanding and programmatically developing how decisions are made and how the outcomes are evaluated, managed, and improved through feedback.

Decision intelligence is a discipline that offers a framework to help data and analytics practitioners develop, model, align, implement, track, and modify decision models and processes tied to business results and performance[353 - Decision intelligence [Электронный ресурс] https://www.simplilearn.com URL: https://www.simplilearn.com/decision-intelligence-article (дата обращения: 27.03.2023)].

Decision support system (DSS) is an information system that supports business or organizational decision-making activities. DSSs serve the management, operations and planning levels of an organization (usually mid and higher management) and help people make decisions about problems that may be rapidly changing and not easily specified in advance – i.e., unstructured and semi-structured decision problems. Decision support systems can be either fully computerized or human-powered, or a combination of both[354 - Decision support system (DSS) [Электронный ресурс] https://en.wikipedia.org URL: https://en.wikipedia.org/wiki/Decision_support_system (дата обращения: 30.06.2023)].

Decision theory (also theory of choice) – the study of the reasoning underlying an agent’s choices. Decision theory can be broken into two branches: normative decision theory, which gives advice on how to make the best decisions given a set of uncertain beliefs and a set of values, and descriptive decision theory which analyzes how existing, possibly irrational agents actually make decisions[355 - Decision theory [Электронный ресурс] https://static.hlt.bme.hu URL: https://static.hlt.bme.hu/semantics/external/pages/Arrow_lehetetlensеgi_tеtel/en.wikipedia.org/wiki/Decision_theory.html (дата обращения: 03.07.2023)].