Calibration layer (Калибровочный слой) A post-prediction adjustment, typically to account for prediction bias. The adjusted predictions and probabilities should match the distribution of an observed set of labels.
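One very simple form of such an adjustment is an additive shift that removes the average prediction bias. The sketch below assumes that form; the function name `calibrate` and the sample values are hypothetical, and practical calibration layers often use richer mappings.

```python
def calibrate(predictions, labels):
    """Shift predictions so their mean matches the mean of the observed labels."""
    # Prediction bias: average prediction minus average observed label.
    bias = sum(predictions) / len(predictions) - sum(labels) / len(labels)
    # Clamp to [0, 1] so the adjusted values remain valid probabilities.
    return [min(1.0, max(0.0, p - bias)) for p in predictions], bias

predictions = [0.4, 0.5, 0.7, 0.8]  # hypothetical model outputs
labels = [0, 0, 1, 1]               # observed labels
adjusted, bias = calibrate(predictions, labels)
```

After the shift, the mean of `adjusted` equals the mean of `labels` whenever no value needed clamping.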
Candidate generation (Генерация кандидатов) The initial set of recommendations chosen by a recommendation system [92].
Candidate sampling (Выборка кандидатов) A training-time optimization in which a probability is calculated for all the positive labels, using, for example, softmax, but only for a random sample of negative labels. For example, if we have an example labeled beagle and dog, candidate sampling computes the predicted probabilities and corresponding loss terms for the beagle and dog class outputs in addition to a random subset of the remaining classes (cat, lollipop, fence). The idea is that the negative classes can learn from less frequent negative reinforcement as long as positive classes always get proper positive reinforcement, and this is indeed observed empirically. The motivation for candidate sampling is computational efficiency: predictions are not computed for all negatives.
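The beagle/dog example can be sketched in plain Python as follows. The helper name `sampled_softmax_loss` and the logit values are hypothetical, and the sketch leaves out the correction for the sampling distribution that production implementations typically apply.

```python
import math
import random

def sampled_softmax_loss(logits, positives, num_sampled, rng):
    """Cross-entropy over the positives plus a random sample of negatives.

    logits: dict mapping class name -> raw score (hypothetical values).
    positives: class names that label this example.
    num_sampled: how many negative classes to draw.
    """
    negatives = [c for c in logits if c not in positives]
    sampled = rng.sample(negatives, num_sampled)
    candidates = list(positives) + sampled
    # Softmax is normalized over the candidate subset only, not all classes.
    log_norm = math.log(sum(math.exp(logits[c]) for c in candidates))
    # Average negative log-likelihood of the positive classes.
    loss = -sum(logits[c] - log_norm for c in positives) / len(positives)
    return loss, candidates

logits = {"beagle": 2.0, "dog": 1.5, "cat": 0.3, "lollipop": -1.0, "fence": -0.5}
loss, candidates = sampled_softmax_loss(
    logits, ["beagle", "dog"], num_sampled=2, rng=random.Random(0)
)
```

Only four of the five classes contribute to the loss here; with a vocabulary of many thousands of classes the saving becomes substantial.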
Canonical Formats In information technology, canonicalization is the process of making something conform to a specification so that it is in an approved format. Canonicalization may sometimes mean generating canonical data from noncanonical data. Canonical formats are widely supported and considered to be optimal for long-term preservation. [93]
Capsule neural network (CapsNet) (Капсульная нейронная сеть) A machine learning system that is a type of artificial neural network (ANN) that can be used to better model hierarchical relationships [94]. The approach is an attempt to more closely mimic biological neural organization [95].
Case-Based Reasoning (CBR) (Рассуждения по прецедентам) is a way to solve a new problem by using solutions to similar problems. It has been formalized as a process consisting of case retrieval, solution reuse, solution revision, and case retention [96].
Categorical data (Категориальные данные) Features having a discrete set of possible values. For example, consider a categorical feature named house style, which has a discrete set of three possible values: Tudor, ranch, colonial. By representing house style as categorical data, the model can learn the separate impacts of Tudor, ranch, and colonial on house price. Sometimes, values in the discrete set are mutually exclusive, and only one value can be applied to a given example. For example, a car maker categorical feature would probably permit only a single value (Toyota) per example. Other times, more than one value may be applicable. A single car could be painted more than one different color, so a car color categorical feature would likely permit a single example to have multiple values (for example, red and white). Categorical features are sometimes called discrete features. Contrast with numerical data [97].
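A common way to feed categorical values to a model is a 0/1 indicator vector over the vocabulary. The sketch below assumes that encoding; the function name `encode` and the car color vocabulary are hypothetical.

```python
def encode(values, vocabulary):
    """Return a 0/1 vector with a 1 for every vocabulary entry present in values."""
    present = set(values)
    unknown = present - set(vocabulary)
    if unknown:
        raise ValueError(f"values not in vocabulary: {sorted(unknown)}")
    return [1 if v in present else 0 for v in vocabulary]

house_styles = ["Tudor", "ranch", "colonial"]
car_colors = ["red", "white", "blue"]  # hypothetical vocabulary

# Mutually exclusive feature: exactly one value per example (one-hot).
one_hot = encode(["ranch"], house_styles)
# Feature that permits several values per example (multi-hot).
multi_hot = encode(["red", "white"], car_colors)
```

Here `one_hot` is `[0, 1, 0]` and `multi_hot` is `[1, 1, 0]`.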
Center for Technological Competence (Центр технологических компетенций) is an organization that owns the results of, and tools for, fundamental research, together with platform solutions that market participants can use as a basis for creating applied solutions (products). A Center for Technological Competence can be a separate organization or part of an application technology holding company.
Central Processing Unit (CPU) is a von Neumann cyclic processor designed to execute complex computer programs.
Centralized control (Централизованное управление) is a process in which control signals are generated in a single control center and transmitted from it to numerous control objects.
Centroid (Центроид) The center of a cluster as determined by a k-means or k-median algorithm. For instance, if k is 3, then the k-means or k-median algorithm finds 3 centroids.
Centroid-based clustering (Кластеризация на основе центроида) A category of clustering algorithms that organizes data into nonhierarchical clusters. k-means is the most widely used centroid-based clustering algorithm. Contrast with hierarchical clustering algorithms.
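As an illustration of how k-means arrives at its centroids, here is a minimal sketch of the standard assign-then-update iteration (Lloyd's algorithm). The function name `k_means`, the sample points, and the initial centroids are assumptions chosen for the example; library implementations add smarter initialization and convergence checks.

```python
def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and recomputing centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign the point to its nearest centroid (squared Euclidean distance).
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

points = [(1.0, 1.0), (1.0, 2.0), (5.0, 5.0), (6.0, 5.0), (9.0, 1.0), (9.0, 2.0)]
# k = 3, so the algorithm finds 3 centroids.
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (5.0, 4.0), (10.0, 0.0)])
```

With these inputs the three centroids settle at `(1.0, 1.5)`, `(5.5, 5.0)`, and `(9.0, 1.5)`.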
Character format Any file format in which information is encoded as characters using only a standard character-encoding scheme. A file written in character format contains only those bytes that are prescribed in the encoding scheme as corresponding to the characters in the scheme (e.g., alphabetic and numeric characters, punctuation marks, and spaces). [98]
Chatbot (Чат-бот) is a software application designed to simulate human conversation with users via text or speech. Also referred to as virtual agents, interactive agents, digital assistants, or conversational AI, chatbots are often integrated into applications, websites, or messaging platforms to provide support to users without the use of live human agents. Chatbots originally started out by offering users simple menus of choices, and then evolved to react to particular keywords. But humans are very inventive in their use of language, says Forrester's McKeon-White. Someone looking for a password reset might say they've forgotten their access code, or are having problems getting into their account. "There are a lot of different ways to say the same thing," he says. This is where AI comes in. Natural language processing is a subset of machine learning that enables a system to understand the meaning of written or even spoken language, even where there is a lot of variation in the phrasing. To succeed, a chatbot that relies on AI or machine learning needs first to be trained using a data set. In general, the bigger the training data set, and the narrower the domain, the more accurate and helpful a chatbot will be [99].
Checkpoint (Контрольная точка) Data that captures the state of the variables of a model at a particular time. Checkpoints enable exporting model weights, as well as performing training across multiple sessions. Checkpoints also enable training to continue past errors (for example, job preemption). Note that the graph itself is not included in a checkpoint.
Chip (Чип) is an electronic microcircuit of arbitrary complexity, made on a semiconductor substrate and either placed in a non-separable case or left unpackaged when it is included in a micro-assembly.
Class (Класс) One of a set of enumerated target values for a label. For example, in a binary classification model that detects spam, the two classes are spam and not spam. In a multi-class classification model that identifies dog breeds, the classes would be poodle, beagle, pug, and so on.
Classification (Классификация) Classification problems use an algorithm to assign test data to specific categories, such as separating apples from oranges. In the real world, supervised learning algorithms can be used to route spam into a folder separate from your inbox. Linear classifiers, support vector machines, decision trees, and random forests are all common types of classification algorithms.
Classification model (Модель классификации) A type of machine learning model for distinguishing among two or more discrete classes. For example, a natural language processing classification model could determine whether an input sentence was in French, Spanish, or Italian.
Classification threshold (Порог классификации) A scalar-value criterion that is applied to a model's predicted score in order to separate the positive class from the negative class. Used when mapping logistic regression results to binary classification.
Clinical Decision Support (CDS) (Поддержка принятия клинических решений) A clinical decision support system is a health information technology system that is designed to provide physicians and other health professionals with clinical decision support, that is, assistance with clinical decision-making tasks [100].
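A minimal sketch of applying a threshold to predicted scores follows. The function name `classify`, the sample scores, and the choice to treat a score exactly equal to the threshold as positive are assumptions made for illustration.

```python
def classify(score, threshold=0.5):
    """Map a predicted score in [0, 1] to the positive (True) or negative (False) class."""
    # Assumption: a score equal to the threshold counts as positive.
    return score >= threshold

scores = [0.12, 0.48, 0.50, 0.93]  # hypothetical model outputs
default_labels = [classify(s) for s in scores]
strict_labels = [classify(s, threshold=0.9) for s in scores]
```

Raising the threshold from 0.5 to 0.9 moves the third example from the positive class to the negative class, which shows how the threshold trades false positives against false negatives.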
Clipping (Отсечение) A technique for handling outliers. Specifically, reducing feature values that are greater than a set maximum value down to that maximum value. Also, increasing feature values that are less than a specific minimum value up to that minimum value. For example, suppose that only a few feature values fall outside the range 40 to 60. In this case, you could do the following: Clip all values over 60 to be exactly 60. Clip all values under 40 to be exactly 40. In addition to bringing input values within a designated range, clipping can also be used to force gradient values within a designated range during training.
Closed dictionary (Закрытый словарь) In speech recognition systems, a dictionary with a limited number of words, for which the recognition system is configured and which the user cannot extend.
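The same operation covers both uses, feature clipping and clipping gradients by value, as the short sketch below shows. The function name `clip` and the sample values are hypothetical.

```python
def clip(value, minimum, maximum):
    """Force value into the closed range [minimum, maximum]."""
    return max(minimum, min(value, maximum))

# Feature clipping: values outside the range 40 to 60 are pulled to the boundary.
features = [38, 41, 55, 60, 73]
clipped_features = [clip(v, 40, 60) for v in features]

# Gradient clipping by value: each component is forced into [-1.0, 1.0].
gradients = [-3.2, 0.4, 7.5]
clipped_gradients = [clip(g, -1.0, 1.0) for g in gradients]
```

Here `clipped_features` is `[40, 41, 55, 60, 60]` and `clipped_gradients` is `[-1.0, 0.4, 1.0]`.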
Cloud (Облако) The cloud is a general metaphor that is used to refer to the Internet. Initially, the Internet was seen as a distributed network and then with the invention of the World Wide Web as a tangle of interlinked media. As the Internet continued to grow in both size and the range of activities it encompassed, it came to be known as the cloud. The use of the word cloud may be an attempt to capture both the size and nebulous nature of the Internet [101].
Cloud computing (Облачные вычисления) is an information technology model for providing ubiquitous and convenient access over the Internet to a shared pool of configurable computing resources (the cloud), such as data storage devices, applications, and services, that can be quickly provisioned and released with minimal operating costs and with little or no involvement of the provider.
Cloud robotics (Облачная робототехника) A field of robotics that attempts to invoke cloud technologies such as cloud computing, cloud storage, and other Internet technologies centred on the benefits of converged infrastructure and shared services for robotics. When connected to the cloud, robots can benefit from the powerful computation, storage, and communication resources of modern data centers in the cloud, which can process and share information from various robots or agents (other machines, smart objects, humans, etc.). Humans can also delegate tasks to robots remotely through networks. Cloud computing technologies enable robot systems to be endowed with powerful capabilities whilst reducing costs. Thus, it is possible to build lightweight, low-cost, smarter robots that have an intelligent brain in the cloud. The brain consists of data centers, knowledge bases, task planners, deep learning, information processing, environment models, communication support, etc. [102]
Cloud TPU (Облачный процессор) A specialized hardware accelerator designed to speed up machine learning workloads on Google Cloud Platform [103].
Cluster analysis (Кластерный анализ) A type of unsupervised learning used for exploratory data analysis to find hidden patterns or groupings in the data; clusters are modeled with a similarity measure defined by metrics such as Euclidean or probabilistic distance.
Clustering (Кластеризация) is a data mining technique for grouping unlabeled data based on their similarities or differences. For example, K-means clustering algorithms assign similar data points to groups, where the value of K sets the number of groups and therefore the granularity of the grouping. This technique is helpful for market segmentation, image compression, etc.
Co-adaptation (Коадаптация) When neurons predict patterns in training data by relying almost exclusively on outputs of specific other neurons instead of relying on the network's behavior as a whole. When the patterns that cause co-adaptation are not present in validation data, then co-adaptation causes overfitting. Dropout regularization reduces co-adaptation because dropout ensures neurons cannot rely solely on specific other neurons.
Cobweb (Метод Cobweb) An incremental system for hierarchical conceptual clustering. COBWEB was invented by Professor Douglas H. Fisher, currently at Vanderbilt University. COBWEB incrementally organizes observations into a classification tree. Each node in a classification tree represents a class (concept) and is labeled by a probabilistic concept that summarizes the attribute-value distributions of objects classified under the node. This classification tree can be used to predict missing attributes or the class of a new object.
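To show why a neuron cannot rely solely on specific other neurons under dropout, here is a minimal sketch of the widely used "inverted dropout" variant. The function name `dropout`, the activation values, and the seed are hypothetical.

```python
import random

def dropout(activations, rate, rng):
    """Inverted dropout: zero each activation with probability `rate` and
    scale the survivors by 1 / (1 - rate) so the expected value is unchanged."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

activations = [0.5, 1.0, 1.5, 2.0]  # outputs of one hidden layer
dropped = dropout(activations, rate=0.5, rng=random.Random(42))
```

Because a different random subset of outputs is zeroed on every training step, any downstream neuron that depended on one particular input would regularly lose it, which pushes the network toward more distributed representations.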
Code (Код) is a one-to-one mapping of a finite ordered set of symbols belonging to some finite alphabet.
Codec A codec is the means by which sound and video files are compressed for storage and transmission purposes. There are various forms of compression: lossy and lossless, but most codecs perform lossy compression because of the much larger data reduction ratios that occur [with lossy compression]. Most codecs are software, although in some areas codecs are hardware components of image and sound systems. Codecs are necessary for playback, since they uncompress [or decompress] the moving image and sound files and allow them to be rendered. [104]
Cognitive architecture (Когнитивная архитектура) The Institute for Creative Technologies defines cognitive architecture as a "hypothesis about the fixed structures that provide a mind, whether in natural or artificial systems, and how they work together in conjunction with knowledge and skills embodied within the architecture to yield intelligent behavior in a diversity of complex environments."
Cognitive computing (Когнитивные вычисления) refers to systems that simulate the human brain to help with decision-making. It uses self-learning algorithms that perform tasks such as natural language processing, image analysis, reasoning, and human-computer interaction. Examples of cognitive systems are IBM's Watson and Google DeepMind [105].
Cognitive Maps (Когнитивные карты) Cognitive maps are structured representations of decisions depicted in graphical format (variations of cognitive maps are cause maps, influence diagrams, or belief nets). Basic cognitive maps include nodes connected by arcs, where the nodes represent constructs (or states) and the arcs represent relationships. Cognitive maps have been used to understand decision situations, to analyze complex cause-effect representations and to support communication. [106]