AI Glossary: 500+ Terms Defined
The most comprehensive AI glossary available. Over 500 terms covering technical concepts, organizations, regulatory frameworks, ethical principles, and industry jargon — defined clearly for researchers, policymakers, journalists, and informed citizens.
Understanding artificial intelligence requires understanding its language. The field has developed a specialized vocabulary that spans computer science, neuroscience, philosophy, law, and political economy. This glossary exists because clarity of language is a prerequisite for clarity of thought, and clarity of thought is a prerequisite for meaningful oversight.
Every term is defined in plain language. Where a term has contested meanings or is used differently across communities, we note the disagreement. Where a term is weaponized as jargon to obscure rather than illuminate, we say so.
This glossary is cross-referenced with our AI Safety Complete Guide, AI Regulation Tracker, and AI Incident Tracker. Terms are organized alphabetically. Use your browser’s search function (Ctrl+F / Cmd+F) to find specific terms quickly.
A
A/B Testing (in AI) — The practice of comparing two model variants by deploying them simultaneously to different user groups and measuring performance differences. Widely used in production AI systems to evaluate changes before full rollout.
Ablation Study — An experiment that systematically removes components of a model or system to determine each component’s contribution to overall performance. Essential for understanding which parts of a model are actually doing useful work.
Abstraction — The process by which a model learns to represent complex input data at higher levels of conceptual organization. Deep learning models build hierarchies of abstraction, from pixel-level features to object-level concepts.
Accelerationism (AI) — The ideological position that AI development should be pursued as rapidly as possible, with minimal regulatory interference, on the grounds that the benefits will outweigh the risks. Sometimes abbreviated as e/acc (effective accelerationism). Criticized by safety researchers as reckless.
Accuracy — The proportion of correct predictions made by a model out of total predictions. A misleading metric when classes are imbalanced — a model that always predicts “not cancer” achieves 99% accuracy if only 1% of cases are cancerous.
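The imbalance problem described above can be shown in a few lines. This is an illustrative sketch: the `accuracy` helper and the toy dataset are invented for the example.

```python
# Accuracy is misleading under class imbalance: a "classifier" that always
# predicts the majority class scores 99% on a dataset that is 1% positive,
# while detecting zero actual positives.

def accuracy(predictions, labels):
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

labels = [1] * 1 + [0] * 99   # 1% positive cases ("cancer")
always_negative = [0] * 100   # trivial majority-class predictor

print(accuracy(always_negative, labels))  # 0.99, yet every positive is missed
```

This is why imbalanced tasks are usually evaluated with precision, recall, or AUC instead of raw accuracy.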
Activation Function — A mathematical function applied to a neuron’s output that introduces non-linearity into the network. Common examples include ReLU, sigmoid, and tanh. Without activation functions, a neural network would be equivalent to a single linear transformation regardless of depth.
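The three common activation functions named above can be sketched directly from their standard definitions:

```python
import math

# Standard definitions of three common activation functions. Each maps a
# neuron's raw output to a non-linear response.

def relu(x):
    # Rectified Linear Unit: zero for negative inputs, identity otherwise
    return max(0.0, x)

def sigmoid(x):
    # Squashes any real input into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Squashes any real input into the range (-1, 1)
    return math.tanh(x)
```

Stacking layers without such a non-linearity collapses to a single linear map, which is the point the definition makes.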
Active Learning — A machine learning approach where the model can query a human oracle to label specific data points it is most uncertain about. Reduces the amount of labeled data required for training by focusing human effort where it matters most.
Adversarial Attack — A deliberate attempt to cause an AI system to make errors by providing carefully crafted inputs. Adversarial examples exploit systematic vulnerabilities in how models process data. See also: AI Incident Tracker.
Adversarial Robustness — A model’s ability to maintain correct behavior when subjected to adversarial attacks. One of the core challenges in AI safety.
Adversarial Training — A training technique where adversarial examples are included in the training data to improve model robustness. The model learns to correctly classify both clean and adversarially perturbed inputs.
Agent (AI Agent) — An AI system that can perceive its environment, make decisions, and take actions to achieve goals with some degree of autonomy. The term ranges from simple chatbot tool use to hypothetical fully autonomous AI systems.
Agentic AI — AI systems designed to operate with extended autonomy, executing multi-step tasks, using tools, and making decisions without continuous human oversight. A major focus of development in 2025-2026 and a significant safety concern.
AGI (Artificial General Intelligence) — A hypothetical AI system capable of performing any intellectual task that a human can perform, with the ability to transfer knowledge across domains. No consensus exists on whether current approaches can achieve AGI or what it would look like. See AI Prediction Scorecard for AGI timeline predictions.
AI Alignment — The challenge of ensuring that an AI system’s goals, behaviors, and values match human intentions. The central problem of AI safety. Subdivided into outer alignment (specifying the right objective) and inner alignment (ensuring the system pursues it).
AI Arms Race — Competition between nations or corporations to develop the most advanced AI capabilities, often at the expense of safety considerations. The US-China AI competition and the HUMAIN initiative in Saudi Arabia are prominent examples.
AI Audit — A systematic evaluation of an AI system’s behavior, fairness, safety, and compliance with applicable regulations. Increasingly required under frameworks like the EU AI Act.
AI Bill of Rights (US) — The Blueprint for an AI Bill of Rights, published by the White House Office of Science and Technology Policy in October 2022. Non-binding principles including safe and effective systems, algorithmic discrimination protections, data privacy, notice and explanation, and human alternatives. See AI Regulation Tracker.
AI Chip — Specialized semiconductor hardware designed for AI workloads. Includes GPUs (NVIDIA), TPUs (Google), custom ASICs, and neuromorphic chips. Subject to export controls that shape global AI competition.
AI Ethics — The branch of applied ethics examining moral questions raised by the development and deployment of AI systems. Covers bias, fairness, transparency, accountability, privacy, autonomy, and the distribution of benefits and harms.
AI Governance — The frameworks, institutions, processes, and norms that shape how AI systems are developed, deployed, and regulated. Operates at organizational, national, and international levels.
AI Hallucination — See Hallucination.
AI Literacy — The ability of individuals to understand, critically evaluate, and interact with AI systems. Mandated as a requirement for deployers and affected persons under the EU AI Act.
AI Office (EU) — The European AI Office, established within the European Commission to oversee implementation and enforcement of the EU AI Act, particularly for general-purpose AI models with systemic risk.
AI Safety — The field dedicated to ensuring AI systems are beneficial, controllable, and do not cause unintended harm. Encompasses technical research (alignment, interpretability, robustness) and governance. See our complete guide.
AI Safety Institute (AISI) — Government-backed institutions focused on AI safety research and evaluation. The UK AISI (founded 2023) and US AISI (founded 2024) are the most prominent. Several countries have announced similar bodies.
AI Sovereignty — A nation’s capacity to develop, deploy, and regulate AI systems independently, without dependence on foreign technology providers. A driving motivation behind HUMAIN and similar national AI programs.
AI Washing — The practice of exaggerating or fabricating AI capabilities in products or services for marketing purposes. The SEC has begun enforcement actions against companies making misleading AI claims.
AI Winter — A historical period of reduced funding and interest in AI research, typically following a cycle of inflated expectations. Previous AI winters occurred in the 1970s and late 1980s. Some analysts question whether current investment levels are sustainable.
AIAAIC Repository — The AI, Algorithmic, and Automation Incidents and Controversies repository, a comprehensive database of AI incidents maintained by Charlie Pownall. Referenced in our AI Incident Tracker.
Algorithm — A finite sequence of well-defined instructions for solving a problem or performing a computation. In AI contexts, refers broadly to the mathematical procedures used for training and inference.
Alignment Tax — The additional computational cost, development time, or capability reduction imposed by implementing safety measures in AI systems. A key concern in AI policy: if the alignment tax is too high, developers have incentives to skip safety work.
Anthropic — An AI safety company founded in 2021 by former OpenAI researchers Dario and Daniela Amodei. Develops the Claude family of language models. Known for research on constitutional AI and mechanistic interpretability.
API (Application Programming Interface) — A set of protocols and tools that allow software applications to communicate with each other. AI models are commonly accessed through APIs that accept inputs and return outputs without exposing model internals.
ASI (Artificial Superintelligence) — A hypothetical AI system that vastly exceeds human cognitive abilities across all domains. Distinguished from AGI by the degree of capability surplus. The subject of significant existential risk analysis.
Attention Mechanism — A neural network component that allows the model to focus on different parts of the input when producing each part of the output. The key innovation behind transformer architectures. Self-attention allows a model to weigh the relevance of every input token to every other token.
Autoencoder — A neural network trained to compress input data into a lower-dimensional representation and then reconstruct the original input. Used for dimensionality reduction, anomaly detection, and generative modeling.
Autonomous Weapon System (AWS) — A weapon system that can select and engage targets without human intervention. The subject of ongoing international debate at the UN Convention on Certain Conventional Weapons. A major factor in the AI Doomsday Clock assessment.
Autoregressive Model — A model that generates output sequentially, with each element conditioned on previously generated elements. GPT-family models are autoregressive: they predict the next token based on all preceding tokens.
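The generation loop itself is simple to sketch. The `next_token` function below is a hypothetical stand-in for a real model's next-token prediction; only the loop structure (each step conditioned on everything generated so far) reflects how autoregressive models actually work.

```python
# Toy autoregressive generation: each new element is produced from the
# full sequence of previous elements, then appended, and the loop repeats.

def next_token(context):
    # Hypothetical stand-in for a model: emit the successor of the last token
    return context[-1] + 1

def generate(prompt, n_steps):
    seq = list(prompt)
    for _ in range(n_steps):
        seq.append(next_token(seq))  # condition on all preceding tokens
    return seq

print(generate([0], 4))  # [0, 1, 2, 3, 4]
```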
B
Backpropagation — The algorithm used to train neural networks by computing the gradient of the loss function with respect to each weight. Works by propagating error signals backward through the network from output to input, enabling efficient gradient computation.
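A hand-worked single-neuron case shows the chain rule that backpropagation applies at scale. This is a minimal sketch, not a framework implementation:

```python
# Forward and backward pass for one neuron: y = w*x + b with squared-error
# loss (y - target)^2. The backward pass propagates the error signal through
# the chain rule, exactly what autodiff frameworks do for millions of weights.

def forward_backward(w, b, x, target):
    y = w * x + b                  # forward pass
    loss = (y - target) ** 2
    dloss_dy = 2 * (y - target)    # backward pass starts at the output
    dloss_dw = dloss_dy * x        # chain rule: dy/dw = x
    dloss_db = dloss_dy * 1.0      # chain rule: dy/db = 1
    return loss, dloss_dw, dloss_db

loss, grad_w, grad_b = forward_backward(w=2.0, b=0.0, x=3.0, target=5.0)
# y = 6, loss = 1, grad_w = 2*(6-5)*3 = 6, grad_b = 2
```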
Batch Normalization — A technique that normalizes layer inputs during training to stabilize and accelerate learning. Reduces internal covariate shift and allows higher learning rates.
Batch Size — The number of training examples processed together before updating model weights. Affects training speed, memory requirements, and generalization performance.
Bayesian Optimization — A strategy for optimizing expensive-to-evaluate functions by building a probabilistic model of the objective and using it to select the most promising points to evaluate. Commonly used for hyperparameter tuning.
BERT (Bidirectional Encoder Representations from Transformers) — A language model architecture developed by Google (2018) that processes text bidirectionally, considering both left and right context simultaneously. Foundational for many NLP tasks including search, classification, and question answering.
Bias (Statistical) — The systematic error introduced when a model’s assumptions do not match reality. High bias leads to underfitting, where the model is too simple to capture underlying patterns.
Bias (Societal) — Systematic unfairness in AI system outputs that disadvantages particular demographic groups. Can originate from training data, model architecture, evaluation metrics, or deployment context. A major category of AI incidents.
Bigram — A sequence of two adjacent elements (usually words or characters) in text. Used in simple language models and as a baseline for evaluating more complex models.
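Extracting and counting bigrams takes only a few lines, which is why they serve as a baseline:

```python
from collections import Counter

# Bigram extraction: adjacent word pairs from a text, the basis for a
# simple count-based language model.

def bigrams(text):
    words = text.split()
    return list(zip(words, words[1:]))

counts = Counter(bigrams("the cat sat on the mat the cat slept"))
print(counts[("the", "cat")])  # 2
```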
Binary Classification — A prediction task with exactly two possible outcomes (e.g., spam/not spam, malignant/benign). The simplest form of classification.
Bitter Lesson — Rich Sutton’s influential 2019 essay arguing that general methods leveraging computation (search and learning) have historically outperformed methods leveraging human knowledge. Often cited to justify scaling approaches over hand-engineering.
Black Box — A system whose internal workings are opaque to external observers. Most deep learning models are black boxes: we can observe inputs and outputs but cannot easily understand the intermediate reasoning. The core challenge motivating interpretability research.
Boltzmann Machine — A type of stochastic neural network that learns probability distributions over its inputs. Largely superseded by modern architectures but historically important in the development of deep learning.
Bostrom, Nick — Swedish philosopher at Oxford, author of “Superintelligence: Paths, Dangers, Strategies” (2014), which brought existential risk from AI into mainstream academic and public discourse.
Bottleneck Layer — A neural network layer with fewer neurons than surrounding layers, forcing the network to compress information. Used in autoencoders and architectures that require dimensionality reduction.
C
Catastrophic Forgetting — The tendency of neural networks to abruptly lose previously learned knowledge when trained on new data. A major challenge for continual learning systems and a reason why fine-tuning must be done carefully.
Chain-of-Thought (CoT) Prompting — A technique where a language model is prompted to show its reasoning step by step before arriving at a final answer. Significantly improves performance on complex reasoning tasks. Variants include zero-shot CoT (“Let’s think step by step”) and few-shot CoT (providing worked examples).
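The two variants can be illustrated as prompt strings. The wording and arithmetic problems here are invented examples; real usage would send these prompts to a language model API.

```python
# Zero-shot CoT: append a generic reasoning trigger to the question.
zero_shot_cot = (
    "Q: A farmer has 15 sheep and buys 8 more. How many sheep are there now?\n"
    "A: Let's think step by step."
)

# Few-shot CoT: include a worked example showing the reasoning style,
# so the model imitates step-by-step reasoning on the new question.
few_shot_cot = (
    "Q: 3 + 4 * 2 = ?\n"
    "A: First compute 4 * 2 = 8, then 3 + 8 = 11. The answer is 11.\n"
    "Q: 5 + 6 * 3 = ?\n"
    "A:"
)
```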
Chatbot — A software application that simulates human conversation. Ranges from simple rule-based systems to sophisticated large language model implementations like ChatGPT, Claude, and Gemini. See AI Tools Database.
ChatGPT — OpenAI’s conversational AI product, launched November 30, 2022. Built on GPT-series models fine-tuned with RLHF. Reached 100 million users within two months, making it the fastest-growing consumer application in history at the time.
Checkpoint — A saved snapshot of a model’s parameters during training. Allows training to be resumed from a specific point and enables selection of the best-performing model version.
CHIPS Act — The CHIPS and Science Act (2022), US legislation providing $52.7 billion for domestic semiconductor manufacturing, research, and workforce development. Directly impacts AI compute availability by incentivizing domestic chip production.
Classification — A supervised learning task where the model assigns input data to one of a predefined set of categories. Examples include image classification, spam detection, and sentiment analysis.
Claude — Anthropic’s family of large language models, designed with a focus on safety through constitutional AI. Named after Claude Shannon, the founder of information theory.
Clustering — An unsupervised learning technique that groups similar data points together based on shared characteristics. Common algorithms include k-means, DBSCAN, and hierarchical clustering.
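The k-means algorithm mentioned above alternates between two steps: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. A minimal one-dimensional sketch:

```python
# Minimal 1-D k-means: alternate assignment and centroid-update steps.
# Real implementations handle higher dimensions and smarter initialization.

def kmeans_1d(points, centroids, iterations=10):
    for _ in range(iterations):
        # Assignment step: group each point with its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

print(kmeans_1d([1.0, 1.2, 0.8, 9.0, 9.5, 8.5], [0.0, 5.0]))  # ≈ [1.0, 9.0]
```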
CNN (Convolutional Neural Network) — A neural network architecture specialized for processing grid-like data such as images. Uses convolutional layers that apply learned filters across the input, capturing spatial hierarchies of features.
Compute — The computational resources (processing power, memory, time) required to train and run AI models. Compute has become the primary bottleneck and competitive advantage in AI development, with frontier model training runs costing tens to hundreds of millions of dollars.
Compute Governance — Policy approaches that regulate AI development through control over computational resources rather than (or in addition to) direct regulation of algorithms. Includes export controls on AI chips and licensing requirements for large training runs.
Concept Drift — The phenomenon where the statistical properties of the data a model was trained on change over time, degrading model performance. Requires ongoing monitoring and retraining in production systems.
Confabulation — See Hallucination. Some researchers prefer this term as more technically accurate, arguing that “hallucination” incorrectly implies a perceptual experience.
Conformity Assessment — Under the EU AI Act, the process by which high-risk AI systems must be evaluated for compliance before being placed on the market. Can be self-assessed or require third-party audit depending on the system category.
Constitutional AI (CAI) — A training methodology developed by Anthropic where an AI model is trained to follow a set of explicit principles (a “constitution”) rather than relying solely on human feedback for each output. The model critiques and revises its own outputs based on these principles.
Context Window — The maximum amount of text (measured in tokens) that a language model can process in a single interaction. As of early 2026, context windows range from 8K tokens to over 1 million tokens for frontier models. Larger context windows enable processing of longer documents but increase computational costs.
Contrastive Learning — A self-supervised learning approach that trains models by encouraging them to produce similar representations for related inputs and dissimilar representations for unrelated inputs.
Control Problem — The challenge of maintaining meaningful human control over AI systems as they become more capable. Encompasses both technical problems (corrigibility, shutdown safety) and governance problems (oversight, accountability).
Convergent Instrumental Goals — See Instrumental Convergence.
Corpus — A large, structured collection of text used for training language models. Common corpora include Common Crawl, The Pile, and RedPajama.
Corrigibility — The property of an AI system that allows humans to modify, correct, or shut it down without the system resisting such interventions. A desirable property for safe AI systems but technically difficult to guarantee in highly capable systems.
Cross-Entropy Loss — A loss function commonly used for classification tasks that measures the difference between the predicted probability distribution and the true distribution. The standard training objective for language models.
Cross-Validation — A technique for evaluating model performance by splitting data into multiple subsets, training on some and testing on others. Provides more reliable performance estimates than a single train-test split.
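The k-fold variant is the most common form: each fold serves once as the test set while the rest are used for training, and scores are averaged across folds. A minimal splitting sketch:

```python
# k-fold splits: partition the data into k folds; each fold is the test
# set exactly once while the remaining folds form the training set.

def k_fold_splits(data, k):
    fold_size = len(data) // k
    for i in range(k):
        test = data[i * fold_size : (i + 1) * fold_size]
        train = data[: i * fold_size] + data[(i + 1) * fold_size :]
        yield train, test

data = list(range(10))
for train, test in k_fold_splits(data, 5):
    assert len(test) == 2 and len(train) == 8
    assert sorted(train + test) == data  # every point used exactly once
```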
CUDA — NVIDIA’s parallel computing platform and API that enables general-purpose computing on GPUs. The dominant software ecosystem for AI training, creating significant vendor lock-in.
D
DALL-E — OpenAI’s text-to-image generation model series. DALL-E 3 (2023) significantly improved prompt adherence and image quality. See AI Tools Database.
Data Augmentation — Techniques for artificially increasing the size and diversity of training data by applying transformations (rotation, cropping, noise injection) to existing examples. Widely used in computer vision.
Data Center — A facility housing computing infrastructure for AI training and inference. AI data center construction is booming, with significant environmental concerns around energy consumption and water usage.
Data Labeling — The process of annotating raw data with categories, tags, or other metadata required for supervised learning. Often performed by low-wage workers in developing countries, raising significant ethical concerns about labor exploitation.
Data Poisoning — An attack where adversaries intentionally introduce corrupted data into a training dataset to manipulate model behavior. A growing security concern as models are trained on increasingly large and unvetted internet data.
Data Privacy — The right of individuals to control how their personal information is collected, stored, and used by AI systems. Governed by regulations including GDPR, CCPA, and increasingly by AI-specific legislation.
Decision Boundary — The surface in feature space that separates different classes in a classification model. The shape and complexity of decision boundaries determine a model’s ability to distinguish between categories.
Decision Tree — A supervised learning model that makes predictions by learning a hierarchy of if-then rules from training data. Interpretable but often less accurate than ensemble methods or deep learning.
Decoder — In transformer architecture, the component that generates output sequences. GPT-family models are decoder-only architectures. In encoder-decoder models (like T5), the decoder attends to both the encoder output and its own previous outputs.
Deep Learning — A subset of machine learning that uses neural networks with multiple layers (depth) to learn hierarchical representations of data. Responsible for most of the advances in AI since 2012.
Deepfake — Synthetic media (typically video or audio) generated using deep learning to convincingly depict events that did not occur. A growing source of AI incidents involving fraud, non-consensual intimate imagery, and political disinformation.
DeepMind — An AI research laboratory founded in London in 2010 and acquired by Google in 2014. Responsible for AlphaGo, AlphaFold, Gemini, and significant contributions to AI safety research. Now operating as Google DeepMind.
Denoising — The process of removing noise from data. Denoising autoencoders learn to reconstruct clean data from corrupted inputs. Denoising diffusion is the basis for modern image generation models.
Diffusion Model — A generative model that creates data by learning to reverse a gradual noising process. Starting from random noise, the model iteratively denoises to produce coherent outputs. The architecture behind Stable Diffusion, DALL-E 3, and Midjourney.
Dimensionality Reduction — Techniques for reducing the number of features or variables in a dataset while preserving meaningful structure. Includes PCA, t-SNE, and UMAP.
Discriminator — In a GAN (Generative Adversarial Network), the network that attempts to distinguish between real data and generated data. Provides training signal to the generator.
Distillation (Knowledge Distillation) — A technique where a smaller “student” model is trained to replicate the behavior of a larger “teacher” model. Produces more efficient models that retain much of the teacher’s capability at lower computational cost.
Distributed Training — Training a single model across multiple GPUs or machines simultaneously. Required for frontier models that exceed the memory and compute capacity of any single device.
Domain Adaptation — The process of adjusting a model trained on data from one domain to perform well on data from a different but related domain. Important for deploying models in contexts where labeled training data is scarce.
Double Descent — A phenomenon where model performance first improves, then worsens, then improves again as model size increases past the interpolation threshold. Challenges classical statistical learning theory’s bias-variance tradeoff.
Downsampling — Reducing the spatial resolution or quantity of data. In neural networks, pooling layers perform downsampling. In data preprocessing, downsampling can address class imbalance.
DPO (Direct Preference Optimization) — An alternative to RLHF for aligning language models with human preferences. DPO directly optimizes the language model on preference data without requiring a separate reward model, simplifying the training pipeline.
Dropout — A regularization technique where random neurons are temporarily removed during training. Prevents co-adaptation of neurons and reduces overfitting. One of the most effective and widely used regularization methods.
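The mechanism is simple to sketch. The version below is "inverted dropout" (the common modern form), where surviving activations are rescaled during training so the expected activation is unchanged and inference needs no adjustment:

```python
import random

# Inverted dropout: each activation is zeroed with probability p during
# training; survivors are scaled by 1/(1-p) so the expected value is
# preserved. At inference time, dropout is disabled entirely.

def dropout(activations, p, training=True):
    if not training:
        return list(activations)
    scale = 1.0 / (1.0 - p)
    return [a * scale if random.random() >= p else 0.0 for a in activations]

random.seed(0)
print(dropout([1.0, 1.0, 1.0, 1.0], p=0.5))  # some entries zeroed, rest doubled
```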
Dual Use — Technology that can be used for both beneficial and harmful purposes. Most AI capabilities are inherently dual-use, which complicates regulatory approaches that attempt to restrict harmful applications without impeding beneficial ones.
E
Edge Computing — Processing data on devices at the “edge” of the network (smartphones, IoT sensors, local servers) rather than in centralized cloud data centers. Enables lower latency, better privacy, and offline AI capabilities.
Embedding — A dense vector representation of data (text, images, audio) in a continuous vector space, where similar items are mapped to nearby points. Word embeddings capture semantic relationships; image embeddings capture visual similarity. A foundational concept in modern AI.
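"Nearby points" is usually measured by cosine similarity between the vectors. The three-dimensional vectors below are toy stand-ins for real embeddings, which typically have hundreds or thousands of dimensions:

```python
import math

# Cosine similarity: the cosine of the angle between two vectors, ignoring
# magnitude. Semantically related items should score closer to 1.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

cat, kitten, car = [1.0, 0.9, 0.1], [0.9, 1.0, 0.1], [0.1, 0.0, 1.0]
assert cosine_similarity(cat, kitten) > cosine_similarity(cat, car)
```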
Emergent Behavior — Capabilities or behaviors in AI systems that were not explicitly programmed or anticipated, arising from the interaction of simpler components or the scale of training. Examples include in-context learning and chain-of-thought reasoning in large language models.
Encoder — In transformer architecture, the component that processes input sequences into contextual representations. BERT is an encoder-only model. The encoder attends to the entire input simultaneously (bidirectionally).
Ensemble Methods — Techniques that combine predictions from multiple models to improve accuracy and robustness. Includes bagging (Random Forest), boosting (XGBoost), and stacking.
Entity Extraction — See Named Entity Recognition.
Epoch — One complete pass through the entire training dataset during model training. Models are typically trained for multiple epochs, though large language models are increasingly trained for less than one epoch on their massive datasets.
Ethical AI — The practice of developing and deploying AI systems in accordance with ethical principles including fairness, transparency, accountability, and respect for human rights. Distinct from (and broader than) the technical field of AI safety.
EU AI Act — Regulation (EU) 2024/1689, the world’s most comprehensive binding AI legislation. Establishes a risk-based classification framework with specific obligations for each tier. See our detailed enforcement guide and regulation tracker.
Evaluation (Model Evaluation) — The process of measuring a model’s performance using metrics, benchmarks, and test datasets. Critical for comparing models, detecting regressions, and assessing deployment readiness.
Existential Risk (X-Risk) — The risk that AI development could lead to human extinction or permanent civilizational collapse. Taken seriously by a growing number of researchers and institutions, though estimates of probability vary widely. A factor in the AI Doomsday Clock assessment.
Explainability — The degree to which the internal mechanics of an AI system can be understood by humans. Closely related to interpretability but sometimes distinguished: explainability may refer to post-hoc explanations of behavior, while interpretability refers to understanding the actual mechanisms.
Exploitation (in RL) — Using the current best-known strategy to maximize reward. Contrasted with exploration (trying new strategies to discover potentially better ones). The exploration-exploitation tradeoff is fundamental to reinforcement learning.
Export Controls — Government restrictions on the transfer of technology across national borders. US export controls on advanced AI chips (NVIDIA A100, H100, and successors) to China have become a major geopolitical flashpoint.
F
Fairness (in AI) — The principle that AI systems should not discriminate against individuals or groups on the basis of protected characteristics. Multiple mathematical definitions of fairness exist and are provably mutually incompatible, creating fundamental tradeoffs.
Feature — An individual measurable property of the data used as input to a model. In traditional ML, features are engineered by humans; in deep learning, features are learned automatically from raw data.
Feature Engineering — The process of selecting, transforming, and creating input features to improve model performance. Less important in deep learning (which automates feature learning) but still critical in many applied ML settings.
Feature Extraction — Using a pre-trained model’s learned representations as input features for a downstream task. A form of transfer learning that is computationally cheaper than fine-tuning.
Federated Learning — A machine learning approach where models are trained across decentralized data sources (e.g., individual user devices) without the data leaving its original location. Designed to preserve privacy by avoiding centralized data collection.
Few-Shot Learning — A model’s ability to perform a task after seeing only a small number of examples. Large language models demonstrate few-shot learning through in-context learning, where examples are provided in the prompt.
Fine-Tuning — Adapting a pre-trained model to a specific task or domain by continuing training on a smaller, task-specific dataset. More efficient than training from scratch because the model retains general knowledge from pre-training.
FLOPS (Floating-Point Operations Per Second) — A measure of computing performance; as a rate, it describes how fast hardware runs. Total training compute, by contrast, is measured in FLOPs (a count of operations, not a rate). Frontier model training runs in 2025-2026 consume on the order of 10^25 to 10^26 FLOPs.
Foundation Model — A large AI model trained on broad data at scale that can be adapted to a wide range of downstream tasks. The term, coined by Stanford researchers in 2021, encompasses large language models, vision models, and multimodal systems.
Frontier Lab — An organization at the leading edge of AI capability development. As of 2026, the frontier labs include OpenAI, Google DeepMind, Anthropic, Meta AI, and xAI. Subject to increasing regulatory scrutiny and voluntary safety commitments.
Frontier Model — The most capable AI model available at a given time. What constitutes “frontier” advances rapidly. Frontier models are the primary focus of AI safety regulation.
Frontier Model Forum — An industry body established in 2023 by Anthropic, Google, Meta, Microsoft, and OpenAI to advance AI safety research. Criticized for being industry self-regulation without independent accountability.
G
GAN (Generative Adversarial Network) — A generative model architecture consisting of two neural networks — a generator and a discriminator — trained in competition. The generator creates synthetic data while the discriminator tries to distinguish it from real data. Influential but largely superseded by diffusion models for image generation.
Gemini — Google DeepMind’s multimodal AI model family, succeeding PaLM. Available in Ultra, Pro, and Nano variants. Integrated into Google products and available through API.
General-Purpose AI (GPAI) — Under the EU AI Act, an AI model trained with a large amount of data using self-supervision at scale, that displays significant generality, and can competently perform a wide range of distinct tasks. Subject to specific obligations including technical documentation and copyright compliance.
Generalization — A model’s ability to perform well on new, unseen data that differs from the training data. The fundamental goal of machine learning. Poor generalization (overfitting) means the model has memorized training data rather than learning underlying patterns.
Generative AI — AI systems that create new content — text, images, audio, video, code — rather than simply classifying or analyzing existing data. The primary driver of AI commercialization since 2022.
Generative Pre-trained Transformer — See GPT.
GFLOPS — Billions of floating-point operations per second. A measure of GPU or accelerator performance relevant to AI workloads.
Goodhart’s Law — “When a measure becomes a target, it ceases to be a good measure.” In AI, describes how optimizing for a proxy metric can produce systems that game the metric without achieving the intended goal. A core insight underlying alignment challenges.
GPU (Graphics Processing Unit) — A processor originally designed for rendering graphics but whose parallel architecture makes it well-suited for the matrix operations central to deep learning. NVIDIA dominates the AI GPU market, followed by AMD and Intel.
GPT (Generative Pre-trained Transformer) — A series of autoregressive language models developed by OpenAI. GPT-3 (2020) demonstrated the power of scale; GPT-4 (2023) achieved near-human performance on many benchmarks; GPT-4o (2024) added native multimodal capabilities.
Gradient — The vector of partial derivatives of a function with respect to its inputs. In neural network training, gradients indicate how to adjust weights to reduce the loss function. The fundamental signal used by optimization algorithms.
Gradient Descent — An optimization algorithm that iteratively adjusts model parameters in the direction that reduces the loss function. The backbone of neural network training. Variants include stochastic gradient descent (SGD), Adam, and AdaGrad.
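The update rule can be sketched in a few lines of NumPy; the data, learning rate, and iteration count below are illustrative toy values, not taken from any particular library:

```python
import numpy as np

# Minimal gradient descent on a one-parameter least-squares problem:
# fit y = w * x by minimizing the mean squared error loss.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x  # the true weight is 2.0

w = 0.0    # initial parameter
lr = 0.05  # learning rate (a hyperparameter)
for _ in range(200):
    pred = w * x
    grad = np.mean(2 * (pred - y) * x)  # derivative of the loss w.r.t. w
    w -= lr * grad                      # step against the gradient

print(round(w, 4))  # converges toward 2.0
```

Swapping the full-dataset gradient for a mini-batch estimate turns this into stochastic gradient descent.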
Gradient Vanishing/Exploding — Training pathologies where gradients become extremely small (vanishing) or large (exploding) as they propagate through deep networks. Architectures like LSTMs, residual connections, and layer normalization were developed to address these problems.
Graph Neural Network (GNN) — A neural network that operates on graph-structured data, where entities are nodes and relationships are edges. Used for social network analysis, molecular property prediction, and recommendation systems.
Ground Truth — The correct answer or label for a data point, used to evaluate model predictions. Establishing ground truth can be straightforward (medical diagnosis confirmed by biopsy) or subjective (quality ratings, content moderation decisions).
Grounding — Connecting abstract model representations to real-world entities, facts, or experiences. Retrieval-Augmented Generation (RAG) is a grounding technique that anchors model outputs in retrieved factual documents.
H
Hallucination — A model output that is factually incorrect, fabricated, or unsupported by the input or training data, presented with apparent confidence. A persistent problem in language models that undermines reliability in high-stakes applications. The term is contested — some researchers prefer “confabulation.”
Heuristic — A practical problem-solving approach that is not guaranteed to be optimal but is sufficient for reaching an immediate goal. AI systems often learn heuristics from data rather than using theoretically optimal algorithms.
Hidden Layer — A layer in a neural network between the input and output layers. The “hidden” descriptor reflects that these layers’ computations are not directly observed. Deep learning is defined by having multiple hidden layers.
High-Risk AI System — Under the EU AI Act, an AI system used in one of eight designated high-risk areas (critical infrastructure, education, employment, essential services, law enforcement, migration, justice, democratic processes) that is subject to mandatory requirements including risk management, data governance, transparency, human oversight, and conformity assessment.
Hinton, Geoffrey — British-Canadian computer scientist, widely regarded as one of the “godfathers of deep learning.” Shared the 2024 Nobel Prize in Physics. Left Google in 2023 to speak freely about AI existential risks. See AI Prediction Scorecard.
HUMAIN — Saudi Arabia’s national AI company, backed by the Public Investment Fund (PIF). Announced in February 2025 with a mandate to build sovereign AI infrastructure and attract global AI investment to the Kingdom. The primary subject of INHUMAIN.AI’s tracking and analysis.
Human-in-the-Loop (HITL) — A system design where human judgment is required at critical decision points. Intended as a safety mechanism but often undermined by automation bias, alert fatigue, and time pressure.
Human Oversight — The capacity for qualified humans to understand, monitor, and intervene in AI system operations. A mandatory requirement for high-risk systems under the EU AI Act.
Hyperparameter — A configuration value set before training begins (as opposed to parameters learned during training). Examples include learning rate, batch size, number of layers, and dropout rate. Hyperparameter selection significantly affects model performance.
I
Image Classification — The task of assigning a label to an entire image from a predefined set of categories. One of the foundational tasks in computer vision.
Imputation — The process of replacing missing data values with substituted values. Common imputation methods include mean/median substitution, k-nearest neighbors, and model-based approaches.
In-Context Learning (ICL) — The ability of large language models to perform tasks based on examples provided within the prompt, without any weight updates. An emergent capability of sufficiently large models that was not anticipated or explicitly trained for.
Inference — The process of using a trained model to make predictions on new data. Distinguished from training (when the model learns) and distinct from statistical inference. Inference costs (compute, latency, energy) are increasingly important as AI deployment scales.
Information Retrieval — The science of searching for information in documents, databases, and on the web. Modern AI has transformed information retrieval through dense retrieval, semantic search, and RAG.
Instruction Tuning — Fine-tuning a language model on datasets of instruction-response pairs to improve the model’s ability to follow natural language instructions. A key step in making base models useful as assistants.
Instrumental Convergence — The thesis that sufficiently advanced AI systems pursuing almost any final goal would converge on similar intermediate (instrumental) goals, including self-preservation, resource acquisition, and goal-content integrity. An important concept in AI existential risk analysis.
Interpretability — The degree to which humans can understand the cause of a model’s decisions. Mechanistic interpretability aims to understand the internal computations of neural networks. A core research area in AI safety.
J
Jailbreaking — Techniques for bypassing the safety constraints and content policies built into AI models. Jailbreaks exploit weaknesses in alignment training to elicit outputs the model was designed to refuse. A persistent cat-and-mouse game between developers and adversarial users.
Jensen’s Inequality — A mathematical result frequently invoked in machine learning theory, stating that the convex transformation of a mean is less than or equal to the mean of the convex transformation. Used in variational inference and other ML methods.
Joint Embedding — A technique where different data types (text and images, for example) are mapped into a shared vector space, enabling cross-modal comparison and retrieval.
K
K-Means Clustering — An unsupervised learning algorithm that partitions data into K groups based on proximity to cluster centroids. Simple, fast, and widely used, but assumes spherical clusters and requires specifying K in advance.
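A minimal sketch of the algorithm (assign each point to its nearest centroid, then recompute centroids), with toy 2-D points chosen for illustration:

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Minimal K-means: alternate assignment and centroid update."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # distance from every point to every centroid
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        centroids = np.array([points[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

pts = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
labels, cents = kmeans(pts, k=2)
print(labels)  # the two tight pairs end up in separate clusters
```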
Kernel — In machine learning, a function that computes the similarity between data points in a (potentially high-dimensional) feature space. Kernel methods, including Support Vector Machines, use kernels to handle non-linear relationships.
Knowledge Base — A structured repository of facts and relationships that an AI system can query. Used in RAG systems, question answering, and expert systems.
Knowledge Distillation — See Distillation.
Knowledge Graph — A graph-structured representation of real-world entities and the relationships between them. Used in search engines, recommendation systems, and AI reasoning.
L
Label — The known correct output for a training example in supervised learning. Labels can be provided by human annotators, derived from existing data, or generated synthetically.
Language Model — A probabilistic model that assigns probabilities to sequences of words. Modern large language models (LLMs) are trained on vast text corpora and demonstrate broad capabilities including generation, analysis, reasoning, and code writing.
Latency — The time delay between input and output in an AI system. Low latency is critical for real-time applications (autonomous driving, conversational AI). Typically measured in milliseconds.
Latent Space — The abstract, compressed representation of data learned by a model. In generative models, new data is created by sampling from and decoding the latent space. Meaningful directions in latent space often correspond to interpretable attributes.
LeCun, Yann — French-American computer scientist, Turing Award winner, Chief AI Scientist at Meta. Known for contributions to convolutional neural networks and his vocal skepticism about existential risk from current AI approaches.
LLM (Large Language Model) — A language model with billions of parameters, trained on massive text datasets. LLMs demonstrate broad capabilities that scale with model size and training data. The primary technology behind modern chatbots, writing tools, and code assistants. See AI Tools Database.
Long Short-Term Memory (LSTM) — A recurrent neural network architecture designed to learn long-range dependencies in sequential data. Uses gating mechanisms to selectively remember and forget information. Largely superseded by transformers for most NLP tasks.
LoRA (Low-Rank Adaptation) — A parameter-efficient fine-tuning technique that freezes the pre-trained model weights and trains small, low-rank adapter matrices. Dramatically reduces the memory and compute requirements for fine-tuning large models.
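The core idea can be sketched in NumPy; the dimensions, rank, and random data below are illustrative:

```python
import numpy as np

# Sketch of the LoRA idea: the frozen weight W is augmented with a
# low-rank update B @ A, and only A and B would be trained.
d, r = 8, 2  # model dimension and LoRA rank (r << d)
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))        # pre-trained weight, frozen
A = rng.normal(size=(r, d)) * 0.01
B = np.zeros((d, r))               # B starts at zero, so the update starts at zero

x = rng.normal(size=d)
y = (W + B @ A) @ x                # forward pass with the adapted weight

# Trainable parameters drop from d*d to 2*d*r.
print(d * d, 2 * d * r)  # 64 16
```

At initialization the adapter contributes nothing, so the adapted model reproduces the base model exactly; training then moves only the 2·d·r adapter parameters.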
Loss Function — A mathematical function that measures the discrepancy between a model’s predictions and the correct answers. Training proceeds by minimizing the loss function. Common examples include cross-entropy loss, mean squared error, and contrastive loss.
M
Machine Learning (ML) — The study of algorithms that improve their performance on a task through experience (data) without being explicitly programmed for that task. A subfield of AI that encompasses supervised, unsupervised, and reinforcement learning.
Mechanistic Interpretability — A subfield of AI safety research that aims to reverse-engineer neural networks by understanding the computational mechanisms (circuits, features, representations) that produce specific behaviors. Pioneered by researchers at Anthropic and DeepMind.
Meta-Learning — “Learning to learn” — training models that can rapidly adapt to new tasks with minimal data. Approaches include MAML (Model-Agnostic Meta-Learning) and prototypical networks.
Midjourney — A text-to-image AI tool known for producing highly aesthetic and stylized images. Operates primarily through Discord. One of the most commercially successful generative AI tools. See AI Tools Database.
Mixture of Experts (MoE) — An architecture where different subsets of model parameters (“experts”) are activated for different inputs, controlled by a routing mechanism. Allows models to have very large total parameter counts while keeping inference costs manageable.
Model Card — A documentation framework for AI models that describes their intended use, limitations, training data, evaluation results, and ethical considerations. Introduced by Mitchell et al. (2019).
Model Collapse — A degradation in model quality that occurs when AI models are trained on data generated by other AI models, creating a recursive loop that amplifies errors and reduces diversity. A growing concern as AI-generated content proliferates online.
Moral Status (of AI) — The philosophical question of whether AI systems can or should be considered moral patients deserving of ethical consideration. Currently a largely theoretical debate but one with significant implications as systems become more sophisticated.
Multi-Agent System — A system comprising multiple AI agents that interact, cooperate, or compete to accomplish tasks. Increasingly used in complex problem-solving and simulation.
Multimodal AI — AI systems that can process and generate multiple types of data (text, images, audio, video) within a single model. GPT-4o, Gemini, and Claude 3.5 are examples of multimodal models.
Multi-Task Learning — Training a model to perform multiple related tasks simultaneously. Can improve performance on individual tasks by leveraging shared representations.
N
Named Entity Recognition (NER) — The task of identifying and classifying named entities (people, organizations, locations, dates) in text. A fundamental NLP task used in information extraction and knowledge base construction.
Natural Language Processing (NLP) — The field of AI concerned with enabling computers to understand, interpret, and generate human language. Encompasses tasks from basic text classification to sophisticated dialogue and translation.
Natural Language Understanding (NLU) — The subfield of NLP focused on machine comprehension of meaning, intent, and context in human language. Distinguished from Natural Language Generation (NLG), which focuses on producing text.
Neural Architecture Search (NAS) — Automated methods for discovering optimal neural network architectures. Uses search algorithms (evolutionary, reinforcement learning, differentiable) to explore the space of possible architectures.
Neural Network — A computing system inspired by biological neural networks, consisting of interconnected nodes (neurons) organized in layers. The fundamental architecture underlying modern deep learning.
NIST (National Institute of Standards and Technology) — US federal agency that has developed influential AI risk management frameworks. The NIST AI Risk Management Framework (AI RMF) is widely referenced in US AI policy. NIST also houses the US AI Safety Institute.
NLP — See Natural Language Processing.
Normalization — Techniques for scaling input features or intermediate representations to a standard range. Includes batch normalization, layer normalization, and group normalization. Improves training stability and convergence.
NVIDIA — The dominant manufacturer of GPUs and AI accelerators. NVIDIA’s A100, H100, and B200 GPUs are the primary hardware for AI training. The company’s market capitalization exceeded $3 trillion in 2024, driven by AI demand.
O
Object Detection — The computer vision task of identifying and locating objects within images, typically by drawing bounding boxes around detected objects and classifying them.
OECD AI Principles — Non-binding principles for responsible AI adopted by the OECD in 2019 and updated in 2024. Cover inclusive growth, human-centered values, transparency, robustness, and accountability. Referenced in many national AI strategies.
One-Shot Learning — The ability to learn a task from a single example. A challenging capability that humans perform naturally but that is difficult for most machine learning systems.
Open Source AI — AI models and systems whose weights, training code, and data are made publicly available. The definition is contested: some argue that “open weights” (releasing model parameters without training data or code) does not constitute true open source. Meta’s LLaMA, Stability AI’s Stable Diffusion, and Mistral’s models are prominent examples.
OpenAI — An AI research organization founded in 2015 as a nonprofit, restructured as a capped-profit entity in 2019, and further restructured in 2024. Creator of GPT-4, ChatGPT, DALL-E, and Sora. The subject of significant controversy over safety practices, governance, and mission drift.
Optimization — The mathematical process of finding the best parameters for a model by minimizing (or maximizing) an objective function. The core mathematical operation in neural network training.
Orthogonality Thesis — The philosophical claim that an AI system’s intelligence level and its goals are independent variables: a system can be arbitrarily intelligent while pursuing any goal. If true, this means we cannot rely on an AI system being “smart enough to be good” — intelligence does not imply benevolence.
Overfitting — When a model performs well on training data but poorly on unseen data, indicating that it has memorized specific training examples rather than learning generalizable patterns. Combated through regularization, data augmentation, and proper validation.
P
Paperclip Maximizer — A thought experiment by Nick Bostrom illustrating existential risk from AI misalignment. An AI tasked with maximizing paperclip production, given sufficient capability, might convert all available matter — including humans — into paperclips or paperclip-production infrastructure. Illustrates how even mundane goals can lead to catastrophic outcomes if pursued without adequate constraints.
Parameter — A variable in a model that is learned from training data. In neural networks, parameters include the weights and biases of each layer. Frontier language models have hundreds of billions to over a trillion parameters.
Parameter-Efficient Fine-Tuning (PEFT) — Techniques for adapting large models to specific tasks by updating only a small subset of parameters. Includes LoRA, prefix tuning, and adapter methods. Reduces computational costs by orders of magnitude compared to full fine-tuning.
Perceptron — The simplest form of neural network: a single neuron that computes a weighted sum of inputs and applies an activation function. The building block of all neural networks, introduced by Frank Rosenblatt in 1958.
Perplexity — A metric for evaluating language models that measures how well the model predicts a sample of text. Lower perplexity indicates better prediction. Also the name of an AI-powered search engine. See AI Tools Database.
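Concretely, perplexity is the exponential of the average negative log-likelihood per token. A sketch with made-up per-token probabilities:

```python
import math

# Perplexity = exp of the mean negative log-likelihood per token.
# Hypothetical probabilities a model assigned to each observed token.
token_probs = [0.5, 0.25, 0.125, 0.5]

nll = [-math.log(p) for p in token_probs]
perplexity = math.exp(sum(nll) / len(nll))
print(round(perplexity, 3))  # 3.364
```

Intuitively, a perplexity of N means the model was, on average, as uncertain as if it were choosing uniformly among N tokens.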
PIF (Public Investment Fund) — Saudi Arabia’s sovereign wealth fund, managing over $1.1 trillion in assets. The primary backer of HUMAIN and of Saudi Arabia’s broader AI ambitions.
Pipeline — A sequence of data processing and model inference steps that transforms raw input into final output. ML pipelines automate the workflow from data ingestion through model training to deployment.
Positional Encoding — A technique used in transformer models to inject information about the position of tokens in a sequence, since the attention mechanism itself is position-invariant. Variants include sinusoidal, learned, and rotary position encodings.
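A minimal sketch of the sinusoidal variant; the function name and toy dimensions are illustrative:

```python
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    """Sinusoidal positional encodings: sin on even dims, cos on odd dims."""
    pos = np.arange(seq_len)[:, None]     # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]  # (1, d_model/2)
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = sinusoidal_positions(seq_len=10, d_model=16)
print(pe.shape)  # (10, 16)
```

These encodings are typically added to the token embeddings before the first attention layer, giving every position a distinct signature.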
Pre-Training — The initial phase of training a foundation model on a large, general-purpose dataset. Pre-training produces a base model with broad knowledge that is subsequently fine-tuned for specific tasks.
Precision — In classification, the proportion of positive predictions that are actually correct. High precision means few false positives. Contrasted with recall.
Prediction Market — Markets where participants trade contracts whose payoffs depend on the outcomes of future events. Increasingly used to aggregate forecasts about AI development timelines, regulation, and capabilities. See AI Prediction Scorecard.
Prompt — The input text provided to a language model to elicit a desired output. Prompt design (prompt engineering) has become a critical skill as models become more capable and sensitive to input phrasing.
Prompt Engineering — The practice of crafting and refining input prompts to optimize language model performance for specific tasks. Techniques include few-shot prompting, chain-of-thought, and system prompts.
Prompt Injection — An attack where adversarial instructions are embedded in a model’s input to override its system prompt or safety constraints. A significant security vulnerability in deployed AI systems. A common category in the AI Incident Tracker.
Q
QLoRA (Quantized LoRA) — A fine-tuning technique that combines quantization with LoRA, enabling fine-tuning of very large models on consumer-grade hardware by reducing memory requirements.
Quantization — Reducing the numerical precision of model weights (e.g., from 32-bit floating point to 8-bit or 4-bit integers). Significantly reduces model size and inference costs with often minimal impact on performance. Essential for deploying large models on resource-constrained devices.
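A sketch of symmetric 8-bit quantization with toy weights; production libraries use more refined schemes (per-channel scales, zero points, calibration data):

```python
import numpy as np

# Symmetric int8 quantization: map float weights onto [-127, 127]
# with a single scale factor, then dequantize to approximate the originals.
weights = np.array([-0.42, 0.0, 0.13, 0.91, -0.77])

scale = np.abs(weights).max() / 127  # one scale for the whole tensor
q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
dequant = q.astype(np.float32) * scale

# Reconstruction error is bounded by half the quantization step.
print(q.dtype, np.max(np.abs(dequant - weights)))
```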
Query (in Attention) — In the attention mechanism, the query vector represents the current position seeking relevant information from other positions. Each position generates a query, key, and value vector.
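The full scaled dot-product attention computation can be sketched in NumPy; the random matrices below stand in for the learned query, key, and value projections:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Scaled dot-product attention: each query scores every key, and the
# resulting weights mix the value vectors.
rng = np.random.default_rng(0)
seq_len, d_k = 4, 8
Q = rng.normal(size=(seq_len, d_k))
K = rng.normal(size=(seq_len, d_k))
V = rng.normal(size=(seq_len, d_k))

weights = softmax(Q @ K.T / np.sqrt(d_k))  # (seq_len, seq_len), rows sum to 1
out = weights @ V                          # (seq_len, d_k)
print(out.shape)
```

Dividing by the square root of the key dimension keeps the dot products in a range where the softmax remains well-behaved.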
R
RAG (Retrieval-Augmented Generation) — A technique that enhances language model outputs by retrieving relevant information from an external knowledge base and including it in the model’s context. Reduces hallucination and enables models to access up-to-date or proprietary information.
Random Forest — An ensemble learning method that trains multiple decision trees on random subsets of the data and averages their predictions. Robust, interpretable, and effective for many tabular data tasks.
Recall — In classification, the proportion of actual positive cases that are correctly identified. High recall means few false negatives. Critical in medical screening and safety applications where missing a positive case is costly.
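Both precision and recall follow directly from the confusion counts; a sketch with toy labels:

```python
# Precision and recall from raw prediction counts (toy values).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

precision = tp / (tp + fp)  # of the positive predictions, how many were right
recall = tp / (tp + fn)     # of the actual positives, how many were found
print(precision, recall)    # 0.75 0.75
```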
Recommendation System — An AI system that predicts user preferences and suggests relevant items. Used extensively in e-commerce, streaming services, social media, and advertising. Raises concerns about filter bubbles and algorithmic manipulation.
Recurrent Neural Network (RNN) — A neural network architecture where connections between nodes form directed cycles, allowing the network to maintain internal state across time steps. Designed for sequential data but largely superseded by transformers for most applications.
Regularization — Techniques that prevent overfitting by adding constraints or penalties during training. Includes L1/L2 regularization, dropout, data augmentation, and early stopping.
Reinforcement Learning (RL) — A learning paradigm where an agent learns to make decisions by taking actions in an environment and receiving reward or penalty signals. The agent learns to maximize cumulative reward over time. Used in game-playing AI, robotics, and RLHF.
Reinforcement Learning from Human Feedback (RLHF) — A technique for aligning language models with human preferences by training a reward model on human comparison judgments and then optimizing the language model against this reward model using reinforcement learning. The primary alignment method used for ChatGPT and similar systems.
Representation Learning — The automated learning of useful features or representations from raw data. Deep learning is essentially representation learning: each layer learns increasingly abstract representations of the input.
Responsible AI — A framework for developing and deploying AI systems that are ethical, fair, transparent, accountable, and aligned with societal values. Encompasses technical practices, organizational policies, and governance structures.
Right to Warn — A public letter signed by current and former AI lab employees in June 2024 calling for stronger protections for employees who raise safety concerns. Highlighted the absence of specific AI whistleblower protections.
Risk-Based Regulation — A regulatory approach that imposes requirements proportional to the level of risk posed by an AI system. The EU AI Act’s four-tier classification (unacceptable, high, limited, minimal risk) is the most prominent example.
Robustness — A model’s ability to maintain performance when faced with noisy, incomplete, adversarial, or out-of-distribution inputs. A key dimension of AI safety.
S
Safety Testing — Systematic evaluation of AI systems to identify potential harms, failures, and vulnerabilities before deployment. Includes red-teaming, adversarial testing, bias audits, and capability evaluations.
Sampling — In language model generation, the process of selecting the next token from the model’s probability distribution. Sampling strategies include greedy decoding, top-k sampling, top-p (nucleus) sampling, and temperature scaling.
Scalable Oversight — The challenge of maintaining human oversight of AI systems that are more capable than any individual human overseer. Approaches include debate, recursive reward modeling, and iterated amplification.
Scaling Laws — Empirically observed relationships between model performance and scale (parameter count, training data volume, compute budget). Research by Kaplan et al. (2020) and Hoffmann et al. (2022) showed predictable power-law relationships. The basis for the current “scaling paradigm” in AI development.
SDAIA (Saudi Data and Artificial Intelligence Authority) — Saudi Arabia’s national authority responsible for AI policy and governance. Works alongside HUMAIN on AI strategy and regulation.
Self-Attention — See Attention Mechanism.
Self-Supervised Learning — A learning paradigm where the training signal is derived from the input data itself, without human-provided labels. Language model pre-training (next-token prediction) and masked image modeling are examples. The dominant approach for training foundation models.
Semantic Search — Search based on meaning rather than keyword matching. Uses embedding models to represent queries and documents as vectors, then finds documents whose vectors are closest to the query vector.
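A toy sketch of the idea; the hand-written vectors below stand in for learned embeddings, and the document names are hypothetical:

```python
import numpy as np

# Toy semantic search: rank documents by cosine similarity between
# embedding vectors. Real systems use learned embedding models.
def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

docs = {
    "cats": np.array([0.9, 0.1, 0.0]),
    "dogs": np.array([0.8, 0.2, 0.1]),
    "stocks": np.array([0.0, 0.1, 0.95]),
}
query = np.array([1.0, 0.0, 0.05])  # stands in for an embedded query about pets

ranked = sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)
print(ranked[0])  # the vector closest to the query ranks first
```

Vector databases exist to run exactly this nearest-neighbor comparison efficiently over millions of embeddings.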
Sentiment Analysis — The task of classifying the emotional tone of text as positive, negative, or neutral. One of the most commercially deployed NLP applications, used in brand monitoring, customer feedback analysis, and market research.
Sequence-to-Sequence (Seq2Seq) — A model architecture that maps an input sequence to an output sequence. Used for translation, summarization, and question answering. Encoder-decoder transformers are the modern standard for seq2seq tasks.
SGD (Stochastic Gradient Descent) — A variant of gradient descent that updates model parameters using a random subset (mini-batch) of training data rather than the full dataset. Faster per iteration and introduces beneficial noise. The foundation of modern neural network optimization.
Singularity (Technological) — The hypothetical point at which AI becomes capable of recursive self-improvement, leading to an intelligence explosion. Popularized by Vernor Vinge and Ray Kurzweil; Kurzweil predicts it will occur around 2045. See AI Prediction Scorecard.
Sora — OpenAI’s text-to-video generation model, capable of producing high-quality, minute-long video from text descriptions. Represents a significant advance in generative AI capabilities.
Stable Diffusion — An open-source text-to-image diffusion model developed by Stability AI. Its open release democratized image generation and enabled a large ecosystem of tools and fine-tuned variants.
Supervised Learning — A learning paradigm where the model learns from labeled examples: input-output pairs where the correct output is provided. The most common form of traditional machine learning.
Support Vector Machine (SVM) — A supervised learning algorithm that finds the optimal hyperplane separating different classes in feature space. Effective for high-dimensional data and small datasets. Less prominent since the rise of deep learning.
Synthetic Data — Artificially generated data used for training AI models. Produced by rule-based systems, statistical models, or generative AI. Useful when real data is scarce, expensive, or privacy-sensitive, but risks introducing biases and reducing diversity.
Systemic Risk — Under the EU AI Act, a designation for general-purpose AI models whose capabilities could cause large-scale harm (presumed for models trained with more than 10^25 FLOPs). Models with systemic risk face additional obligations including adversarial testing, incident reporting, and cybersecurity requirements.
T
Temperature — A hyperparameter that controls the randomness of a language model’s output. Higher temperature produces more creative and varied outputs; lower temperature produces more deterministic and focused outputs. At temperature 0, the model always selects the most probable token.
TensorFlow — An open-source machine learning framework developed by Google. One of the two dominant frameworks for deep learning, alongside PyTorch. Used for model development, training, and deployment.
Token — A unit of text processed by a language model. Tokens are typically subword units: common words may be single tokens while rare words are split into multiple tokens. Roughly, 1 token ≈ 4 characters or ≈ 0.75 words in English.
Tokenizer — A component that converts raw text into a sequence of tokens (subword units) that can be processed by a language model. Common tokenization algorithms include BPE (Byte-Pair Encoding), WordPiece, and SentencePiece. Tokenizer design affects model performance, efficiency, and multilingual capability.
Top-K Sampling — A text generation strategy that restricts the model to choosing from only the K most probable next tokens. Balances diversity and quality.
Top-P (Nucleus) Sampling — A text generation strategy that selects from the smallest set of tokens whose cumulative probability exceeds a threshold P. More adaptive than top-K because the number of considered tokens varies based on the model’s confidence.
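Temperature scaling and top-p filtering can be sketched together; the logits, cutoff, and helper function below are illustrative:

```python
import numpy as np

# Sketch of temperature scaling followed by top-p (nucleus) filtering
# over a toy next-token distribution.
def top_p_filter(probs, p=0.9):
    order = np.argsort(probs)[::-1]              # tokens by descending probability
    cum = np.cumsum(probs[order])
    keep = order[: np.searchsorted(cum, p) + 1]  # smallest set covering mass p
    mask = np.zeros_like(probs)
    mask[keep] = probs[keep]
    return mask / mask.sum()                     # renormalize the survivors

logits = np.array([2.0, 1.0, 0.5, -1.0])
temperature = 0.7                       # < 1 sharpens the distribution
scaled = logits / temperature
probs = np.exp(scaled) / np.exp(scaled).sum()

filtered = top_p_filter(probs, p=0.9)
print(filtered)  # low-probability tail tokens are zeroed out
```

The final token would then be sampled from `filtered`; with these toy numbers only the top two tokens survive the 0.9 cutoff.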
TPU (Tensor Processing Unit) — Google’s custom AI accelerator chip, designed specifically for neural network workloads. TPUs are used exclusively within Google’s cloud infrastructure and power the training of Gemini and other Google models.
Training — The process of adjusting a model’s parameters to minimize a loss function on a dataset. For frontier language models, training involves processing trillions of tokens over weeks to months on thousands of GPUs, at costs of tens to hundreds of millions of dollars.
Transfer Learning — Using knowledge gained from training on one task or domain to improve performance on a different but related task or domain. The foundation model paradigm is the most successful application of transfer learning in AI history.
Transformer — A neural network architecture introduced by Vaswani et al. in the 2017 paper “Attention Is All You Need.” Uses self-attention mechanisms to process input sequences in parallel rather than sequentially. The dominant architecture for language models, and increasingly for vision and multimodal systems.
Turing Test — A test proposed by Alan Turing in 1950: if a machine can engage in a conversation indistinguishable from a human’s, it can be said to “think.” Widely discussed but increasingly considered an inadequate measure of intelligence, as modern chatbots can pass conversational Turing tests without possessing genuine understanding.
U
Underfitting — When a model is too simple to capture the underlying patterns in the data, resulting in poor performance on both training and test data. Indicates that the model lacks sufficient capacity or has not been trained long enough.
UNESCO Recommendation on AI Ethics — Adopted by 193 member states in November 2021, the first global normative instrument on AI ethics. Covers proportionality, safety, fairness, sustainability, and human oversight. Non-binding but influential. See AI Regulation Tracker.
Unsupervised Learning — A learning paradigm where the model discovers patterns and structure in data without human-provided labels. Includes clustering, dimensionality reduction, and generative modeling. Self-supervised learning is sometimes classified as a subset.
Upsampling — Increasing the spatial resolution or quantity of data. In generative models, upsampling layers produce higher-resolution outputs from lower-resolution representations.
V
Validation Set — A subset of data used during training to evaluate model performance and tune hyperparameters, separate from both the training set and the final test set. Helps detect overfitting.
Value Alignment — The challenge of ensuring that an AI system’s values and objectives match those of its human operators and society broadly. More specific than general alignment: focuses on the alignment of values rather than just behaviors.
Variational Autoencoder (VAE) — A generative model that learns a probabilistic encoding of data into a continuous latent space. Can generate new data by sampling from the latent distribution. Used in image generation, drug discovery, and representation learning.
Vector Database — A database optimized for storing and querying high-dimensional vector embeddings. Essential infrastructure for RAG systems, semantic search, and recommendation engines. Examples include Pinecone, Weaviate, Milvus, and Chroma.
Vision Transformer (ViT) — A transformer architecture applied to image classification by treating image patches as tokens. Demonstrated that transformers can match or exceed CNNs on vision tasks, especially at scale.
Voice Cloning — AI technology that replicates a specific person’s voice from a small sample of audio. Enables realistic speech synthesis but raises serious concerns about fraud, impersonation, and deepfake audio. A growing category of AI incidents.
W
Watermarking (AI) — Techniques for embedding imperceptible signals in AI-generated content that identify it as machine-generated. Proposed as a tool for combating misinformation and deepfakes, but current methods can often be removed or circumvented.
Weight — A numerical parameter in a neural network that determines the strength of the connection between neurons. Weights are adjusted during training to minimize the loss function. A model’s “knowledge” is encoded in its weights.
Weight Decay — A regularization technique that penalizes large weights by adding a fraction of the weight values to the loss function. Encourages the model to use smaller weights, which often improves generalization.
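The mechanics are simple: an L2 penalty term is added to the data loss. In this toy sketch (made-up predictions and weights), two models make identical predictions, but the one with larger weights pays a higher total loss, which is exactly the pressure toward smaller weights that weight decay applies.

```python
def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

def loss_with_weight_decay(preds, targets, weights, decay=0.01):
    """Total loss = data loss + decay * sum of squared weights (L2 penalty)."""
    l2_penalty = sum(w * w for w in weights)
    return mse(preds, targets) + decay * l2_penalty

# Identical (perfect) predictions, different weight magnitudes:
small = loss_with_weight_decay([1.0, 2.0], [1.0, 2.0], weights=[0.1, 0.2])
large = loss_with_weight_decay([1.0, 2.0], [1.0, 2.0], weights=[10.0, 20.0])
print(small < large)   # True: bigger weights cost more, all else equal
```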
Whisper — OpenAI’s automatic speech recognition model, capable of multilingual transcription and translation. Released as open-source, it significantly democratized high-quality speech recognition.
Whistleblower — An individual who reports wrongdoing, safety concerns, or ethical violations within an organization. AI whistleblowers face unique challenges due to the absence of specific legal protections. See our AI Whistleblower Protection guide.
Word Embedding — See Embedding. Specifically, a vector representation of a word that captures semantic meaning. Word2Vec (2013) and GloVe (2014) were early influential approaches.
X
XAI (Explainable AI) — AI systems designed to provide human-understandable explanations of their decisions and behavior. Mandated by several regulatory frameworks. Techniques include LIME, SHAP, attention visualization, and feature attribution.
XGBoost — An optimized gradient boosting library widely used for structured/tabular data tasks. Dominant in machine learning competitions and industry applications involving non-image, non-text data.
X-Risk — See Existential Risk.
Y
YOLO (You Only Look Once) — A family of real-time object detection models that process an entire image in a single pass through the network. Known for speed and efficiency, widely used in surveillance, autonomous driving, and industrial inspection.
Z
Zero-Shot Learning — A model’s ability to perform a task without having seen any examples of that specific task during training. Large language models demonstrate strong zero-shot capabilities through instruction following.
Zero-Shot Prompting — Providing a language model with a task description but no examples, relying on the model’s pre-trained knowledge to perform the task. Contrast with few-shot prompting.
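The contrast is purely a matter of prompt construction, sketched below with a made-up sentiment task. The zero-shot prompt contains only the task description and the input; the few-shot prompt prepends worked examples. No particular model or API is assumed.

```python
# Zero-shot: task description + input, no worked examples.
zero_shot = (
    "Classify the sentiment of the review as positive or negative.\n"
    "Review: The battery died after two days.\n"
    "Sentiment:"
)

# Few-shot: the same task, but with labelled examples prepended.
few_shot = (
    "Classify the sentiment of the review as positive or negative.\n"
    "Review: Absolutely loved it.\nSentiment: positive\n"
    "Review: Broke on arrival.\nSentiment: negative\n"
    "Review: The battery died after two days.\n"
    "Sentiment:"
)
```

Either string would be sent to a language model as-is; the only difference is whether the model must rely entirely on its pre-trained knowledge (zero-shot) or can also pattern-match the in-context examples (few-shot).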
Organizations & Institutions
AI Now Institute — A research institute, founded at New York University and now independent, focused on the social implications of AI. Known for critical analysis of AI industry power concentration and surveillance technologies.
AI Safety Institute (UK) — Established by the UK government in November 2023 to evaluate frontier AI models for safety. The first government body dedicated to AI safety testing.
AI Safety Institute (US) — Established within NIST in 2024 by executive order. Charged with developing guidelines for safety testing of frontier AI models.
Allen Institute for AI (AI2) — A research institute founded by Paul Allen in 2014. Known for open research including the OLMo open language model and Semantic Scholar.
Anthropic — See entry under A.
CAIS (Center for AI Safety) — A San Francisco-based nonprofit focused on reducing existential risk from AI. Published the influential “Statement on AI Risk” signed by leading AI researchers in 2023.
DeepMind — See entry under D.
EleutherAI — A grassroots collective of researchers focused on open-source AI. Created GPT-NeoX, The Pile (training dataset), and the Language Model Evaluation Harness.
Future of Humanity Institute (FHI) — A research center at Oxford University founded by Nick Bostrom in 2005, focused on existential risks. Closed in 2024 after institutional disputes.
Hugging Face — A platform and community for sharing machine learning models, datasets, and applications. The de facto hub for open-source AI, hosting hundreds of thousands of models.
Meta AI (FAIR) — Meta’s AI research division, formerly Facebook AI Research. Known for open-source contributions including LLaMA, PyTorch, and significant research in computer vision and NLP.
MIRI (Machine Intelligence Research Institute) — A Berkeley-based nonprofit focused on technical AI alignment research. One of the earliest organizations to work on AI existential risk, founded in 2000.
Mistral AI — A French AI startup founded in 2023 that develops open-weight language models. Known for efficient model architectures and strong performance relative to model size.
OpenAI — See entry under O.
Partnership on AI — A multi-stakeholder organization founded in 2016 by Amazon, Apple, DeepMind, Google, Facebook, IBM, and Microsoft to develop best practices for AI. Criticized for industry dominance.
Stability AI — The company behind Stable Diffusion and related open-source generative AI models. Experienced significant leadership and financial turmoil in 2024.
xAI — Elon Musk’s AI company, founded in 2023. Develops the Grok series of language models, trained in part on X (Twitter) data.
Regulatory & Policy Terms
Adequacy Decision — Under the GDPR, a determination by the European Commission that a third country provides an adequate level of data protection, enabling personal data transfers to that country without additional safeguards. Often invoked by analogy in discussions of mutual recognition under AI regulation.
AI Literacy Obligation — Under the EU AI Act, the requirement that providers and deployers of AI systems ensure that their staff and other involved persons have sufficient AI literacy, taking into account their technical knowledge, experience, education, and context.
Codes of Practice — Under the EU AI Act, voluntary codes developed by industry that GPAI model providers can follow to demonstrate compliance with transparency and safety obligations.
Conformity Assessment — See entry under C.
GPAI (General-Purpose AI) — See General-Purpose AI.
High-Risk AI System — See entry under H.
Prohibited AI Practice — Under the EU AI Act, AI practices deemed to pose unacceptable risk and banned outright. Includes social scoring, real-time biometric identification in public spaces (with exceptions), subliminal manipulation, and exploitation of vulnerabilities.
Regulatory Sandbox — A controlled environment where companies can test AI innovations under regulatory supervision with reduced compliance requirements. Mandated by the EU AI Act and implemented by several national authorities.
Risk Assessment — A systematic process for identifying and evaluating the potential risks of an AI system. Required for high-risk systems under most AI regulatory frameworks.
Transparency Obligation — Requirements for AI system operators to disclose relevant information about the system’s nature, capabilities, limitations, and decision-making processes to affected individuals.
Ethical & Philosophical Terms
AI Rights — The question of whether AI systems could or should be granted legal rights or moral consideration. Currently a philosophical debate, but one that could become practically relevant as systems become more sophisticated.
Alignment Problem — See AI Alignment. The term was popularized by Brian Christian’s 2020 book “The Alignment Problem.”
Beneficence — The ethical principle of acting for the benefit of others. In AI ethics, the obligation to develop AI systems that promote human welfare.
Consequentialism — An ethical framework that evaluates actions based on their outcomes. In AI ethics, consequentialist analysis weighs the expected benefits and harms of AI development and deployment.
Deontological Ethics — An ethical framework that evaluates actions based on rules and duties rather than outcomes. In AI governance, deontological approaches establish inviolable rights and prohibitions regardless of potential benefits.
Digital Divide — The gap between those who have access to digital technology (including AI) and those who do not. AI development risks widening existing inequalities if benefits are concentrated among wealthy nations and populations.
Dual Use — See entry under D.
Effective Altruism (EA) — A philosophical and social movement that uses evidence and reasoning to determine the most effective ways to benefit others. Strongly associated with AI safety funding and research priorities. Subject to significant controversy following the FTX collapse.
Instrumental Convergence — See entry under I.
Longtermism — The ethical view that positively influencing the long-term future is a key moral priority. Closely associated with concern about AI existential risk and the effective altruism movement.
Non-Maleficence — The ethical principle of “first, do no harm.” In AI ethics, the obligation to avoid developing or deploying AI systems that cause unjustified harm.
Orthogonality Thesis — See entry under O.
Precautionary Principle — The principle that if an action or policy has a suspected risk of causing harm, the burden of proof falls on those taking the action, even if scientific consensus has not been established. Frequently invoked in debates about AI regulation.
This glossary is maintained by the INHUMAIN.AI editorial team and updated as the field evolves. If you believe a term is missing or a definition is inaccurate, contact us through our editorial page. For deeper exploration of specific topics, see our AI Safety Complete Guide, AI Regulation Tracker, and HUMAIN Tracker.