AI Glossary

The complete dictionary of Artificial Intelligence

162 categories, 2,032 subcategories, 23,060 terms

RMSprop

Adaptive optimization technique that divides the learning rate by the square root of an exponential moving average of recent squared gradients, keeping step sizes stable when gradient magnitudes vary widely.
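
As a rough illustration of the RMSprop rule, here is a minimal scalar sketch in plain Python. The function name `rmsprop_step`, the default hyperparameters, and the toy objective f(x) = x^2 are assumptions chosen for the example, not taken from any particular library.

```python
import math

def rmsprop_step(param, grad, avg_sq, lr=0.01, rho=0.9, eps=1e-8):
    """One RMSprop update for a single scalar parameter (illustrative sketch)."""
    # Exponential moving average of the squared gradients.
    avg_sq = rho * avg_sq + (1.0 - rho) * grad * grad
    # Scale the step by the inverse square root of that average.
    param = param - lr * grad / (math.sqrt(avg_sq) + eps)
    return param, avg_sq

# Toy usage: minimise f(x) = x^2, whose gradient is 2x, starting from x = 5.
x, avg_sq = 5.0, 0.0
for _ in range(1000):
    x, avg_sq = rmsprop_step(x, 2.0 * x, avg_sq)
```

Because the gradient is normalised by its own recent magnitude, the step size stays close to `lr` regardless of how large the raw gradient is.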

Adagrad

Adaptive optimization algorithm that scales each parameter's learning rate by the inverse square root of its accumulated squared gradients, so infrequently updated parameters take larger steps.
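
The effect on frequent versus infrequent parameters can be sketched in a few lines of plain Python. The helper `adagrad_step` and the two toy parameters below are hypothetical names used only for this example.

```python
import math

def adagrad_step(param, grad, sum_sq, lr=0.1, eps=1e-8):
    """One Adagrad update for a single scalar parameter (illustrative sketch)."""
    # Accumulate the full history of squared gradients (never decays).
    sum_sq = sum_sq + grad * grad
    param = param - lr * grad / (math.sqrt(sum_sq) + eps)
    return param, sum_sq

# Two toy parameters: one receives a gradient every step, one every 10th step.
p_freq, s_freq = 0.0, 0.0
p_rare, s_rare = 0.0, 0.0
for t in range(100):
    p_freq, s_freq = adagrad_step(p_freq, 1.0, s_freq)
    if t % 10 == 0:
        p_rare, s_rare = adagrad_step(p_rare, 1.0, s_rare)

# Effective learning rate lr / sqrt(sum_sq) is larger for the rare parameter.
eff_lr_freq = 0.1 / math.sqrt(s_freq)
eff_lr_rare = 0.1 / math.sqrt(s_rare)
```

The accumulator only grows, which is also why the effective learning rate keeps shrinking over long training runs.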

Adadelta

Extension of Adagrad that counters the learning rate's continual decay by replacing the full sum of past squared gradients with an exponential moving average, which approximates a fixed-size window of recent gradients.
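
A minimal scalar sketch of the Adadelta update follows, assuming the commonly cited form that tracks two moving averages (squared gradients and squared updates). The function name and defaults are illustrative assumptions.

```python
import math

def adadelta_step(param, grad, avg_sq_grad, avg_sq_delta, rho=0.95, eps=1e-6):
    """One Adadelta update for a single scalar parameter (illustrative sketch)."""
    # Moving average of squared gradients replaces Adagrad's growing sum.
    avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad * grad
    # The step is scaled by the ratio of RMS(previous updates) to RMS(gradients),
    # so no global learning rate is required in this formulation.
    delta = -math.sqrt(avg_sq_delta + eps) / math.sqrt(avg_sq_grad + eps) * grad
    # Moving average of squared updates, used by the next step.
    avg_sq_delta = rho * avg_sq_delta + (1.0 - rho) * delta * delta
    return param + delta, avg_sq_grad, avg_sq_delta

# Toy usage: one step on f(x) = x^2 from x = 1 (gradient 2).
x, g2, d2 = adadelta_step(1.0, 2.0, 0.0, 0.0)
```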

Adamax

Variant of Adam that scales updates by an exponentially weighted infinity norm of past gradients instead of the L2 norm, which can make the update more numerically stable in some scenarios.
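
The infinity-norm idea can be sketched as a running maximum in place of Adam's second-moment average. This is a scalar sketch; `adamax_step`, its defaults, and the small `eps` guard against division by zero are assumptions made for the example.

```python
def adamax_step(param, grad, m, u, t, lr=0.002, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adamax update for a scalar parameter; t is the 1-based step count."""
    # First moment: the same exponential moving average as in Adam.
    m = beta1 * m + (1.0 - beta1) * grad
    # Infinity-norm term: decayed running maximum of |gradient|.
    u = max(beta2 * u, abs(grad))
    # Only the first moment needs bias correction.
    param = param - (lr / (1.0 - beta1 ** t)) * m / (u + eps)
    return param, m, u

# Toy usage: two consecutive steps with gradients 2.0 and 0.5.
p1, m1, u1 = adamax_step(1.0, 2.0, 0.0, 0.0, t=1)
p2, m2, u2 = adamax_step(p1, 0.5, m1, u1, t=2)
```

After the second step, `u2` is still dominated by the earlier, larger gradient, which is what keeps the step size from jumping when a small gradient follows a large one.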

Nadam

Combination of Nesterov accelerated gradient and Adam that applies Nesterov's look-ahead momentum within Adam's adaptive framework, which can speed up and stabilize convergence.

AMSGrad

Modification of Adam that normalizes updates by the running maximum of the exponential moving average of squared gradients, so the effective learning rate never increases; it is designed to restore the convergence guarantees that Adam can violate.
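
The "retain the maximum" step is easiest to see in code. The scalar sketch below omits bias correction for brevity, which is an assumption of this example; the function name and defaults are likewise illustrative.

```python
import math

def amsgrad_step(param, grad, m, v, v_max,
                 lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad update for a scalar parameter (sketch, no bias correction)."""
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Key difference from Adam: the denominator uses the largest v seen so far.
    v_max = max(v_max, v)
    param = param - lr * m / (math.sqrt(v_max) + eps)
    return param, m, v, v_max

# Toy usage: a large gradient followed by a zero gradient.
p, m, v, v_max = amsgrad_step(1.0, 10.0, 0.0, 0.0, 0.0)
first_v_max = v_max
p, m, v, v_max = amsgrad_step(p, 0.0, m, v, v_max)
```

After the second step `v` has decayed, but `v_max` still holds the earlier peak, so the denominator cannot shrink.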

AdamW

Variant of Adam that decouples weight decay from the adaptive update, applying decay directly to the weights rather than to the gradients.
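
To make "decoupled" concrete, here is a scalar sketch in which the decay multiplies the weight directly and never enters the moment estimates. The name `adamw_step` and the default values are assumptions for illustration.

```python
import math

def adamw_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update for a scalar parameter; t is the 1-based step count."""
    # Decoupled weight decay: shrink the weight itself, outside the Adam update.
    param = param * (1.0 - lr * weight_decay)
    # Standard Adam moments, computed from the raw gradient only.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# With a zero gradient, only the decay term moves the weight.
p_decay_only, _, _ = adamw_step(1.0, 0.0, 0.0, 0.0, t=1)
# With a non-zero gradient, the decay and the Adam step both apply.
p_full, _, _ = adamw_step(1.0, 2.0, 0.0, 0.0, t=1)
```

In an L2-regularized Adam, by contrast, the decay term would be added to `grad` and then rescaled by the adaptive denominator.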

SGDW

Extension of SGD with momentum that decouples weight decay from the gradient update, shrinking the weights directly instead of adding an L2 term to the gradient.
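
A simplified scalar sketch of the same idea for SGD with momentum is shown below. The exact placement of the learning rate inside the momentum buffer varies between formulations, so treat this form, the function name, and the defaults as assumptions.

```python
def sgdw_step(param, grad, velocity, lr=0.1, momentum=0.9, weight_decay=0.01):
    """One SGDW-style update for a scalar parameter (simplified sketch)."""
    # The momentum buffer sees only the raw gradient, never the decay term.
    velocity = momentum * velocity + grad
    # Gradient step and weight decay are applied as two separate terms.
    param = param - lr * velocity - lr * weight_decay * param
    return param, velocity

# With a zero gradient the weight still shrinks, and the buffer stays empty.
p0, v0 = sgdw_step(1.0, 0.0, 0.0)
# With gradient 1.0 both terms contribute.
p1, v1 = sgdw_step(1.0, 1.0, 0.0)
```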

RAdam

Rectified Adam, which addresses the high variance of the adaptive learning rate in early training, when few gradients have been observed, by applying a rectification term that adjusts or temporarily disables the adaptive scaling.

YellowFin

Optimizer that automatically tunes the learning rate and momentum coefficient of momentum SGD, based on an analysis of how momentum behaves on a local quadratic approximation of the objective.

LARS

Layer-wise Adaptive Rate Scaling, which sets a per-layer learning rate proportional to the ratio of the L2 norm of the layer's weights to the L2 norm of its gradients, for large-batch training.
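
The per-layer ratio can be sketched for a single layer represented as a list of floats. This simplified version leaves out momentum and weight decay, and the names `lars_step` and `trust_coef` are assumptions made for the example.

```python
import math

def l2_norm(values):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in values))

def lars_step(weights, grads, lr=0.1, trust_coef=0.001, eps=1e-8):
    """One LARS-style update for one layer (sketch: no momentum, no decay)."""
    w_norm = l2_norm(weights)
    g_norm = l2_norm(grads)
    if w_norm > 0.0 and g_norm > 0.0:
        # Layer-wise rate: ratio of weight norm to gradient norm.
        local_lr = trust_coef * w_norm / (g_norm + eps)
    else:
        # Fall back to the plain learning rate when a norm is zero.
        local_lr = 1.0
    return [w - lr * local_lr * g for w, g in zip(weights, grads)]

# Scaling the gradients by 100 leaves the update essentially unchanged.
small = lars_step([3.0, 4.0], [0.6, 0.8])
large = lars_step([3.0, 4.0], [60.0, 80.0])
```

Because the gradient norm cancels out, the size of the update depends on the weight norm rather than on the raw gradient scale, which is the property that helps with very large batches.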

LAMB

Layer-wise Adaptive Moments optimizer for Batch training, which extends the layer-wise scaling of LARS with Adam-style moment estimates for efficient large-batch training of very large models.

Rprop

Resilient Backpropagation, which adapts a separate step size for each parameter using only the sign of the gradient and ignoring its magnitude: the step grows while the sign stays the same and shrinks when it flips.
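
A scalar sketch of a sign-based update is given below. It follows the variant often called iRprop-, which skips the move after a sign flip; that choice, the function name, and the default factors are assumptions made for the example.

```python
def rprop_step(param, grad, prev_grad, step,
               eta_plus=1.2, eta_minus=0.5, step_min=1e-6, step_max=50.0):
    """One Rprop update for a scalar parameter (iRprop- style sketch)."""
    agreement = grad * prev_grad
    if agreement > 0:
        # Same sign as last time: grow the step size.
        step = min(step * eta_plus, step_max)
    elif agreement < 0:
        # Sign flipped: shrink the step and skip this move.
        step = max(step * eta_minus, step_min)
        grad = 0.0
    # Move by the step size in the direction opposite to the gradient's sign.
    if grad > 0:
        param -= step
    elif grad < 0:
        param += step
    # The returned gradient is what the caller passes as prev_grad next time.
    return param, grad, step

# Gradient magnitude is ignored: both calls produce the same update.
big = rprop_step(1.0, 1000.0, 1000.0, 0.1)
tiny = rprop_step(1.0, 0.001, 0.001, 0.1)
# A sign flip halves the step and leaves the parameter where it is.
flipped = rprop_step(1.0, -1.0, 1.0, 0.1)
```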

QHAdam

Quasi-Hyperbolic Adam, which generalizes Adam and momentum methods by replacing each moment estimate with a weighted average of that estimate and the current gradient (or squared gradient), controlled by two additional parameters.
