Marcus Edel public
LLM-FP4: 4-Bit Floating-Point Quantized Transformers
Detecting Pretraining Data from Large Language Models
ConvNets Match Vision Transformers at Scale
A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation
QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models
 
Matryoshka Diffusion Models
Dissecting In-Context Learning of Translations in GPTs
Woodpecker: Hallucination Correction for Multimodal Large Language Models
SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding
By Marcus Edel
 
FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Localizing and Editing Knowledge in Text-to-Image Generative Models
 
H2O Open Ecosystem for State-of-the-art Large Language Models
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Teaching Language Models to Self-Improve through Interactive Demonstrations
 
Think before you speak: Training Language Models With Pause Tokens
Towards Self-Assembling Artificial Neural Networks through Neural Developmental Programs
Efficient Streaming Language Models with Attention Sinks
Large Language Models Cannot Self-Correct Reasoning Yet
SmartPlay: A Benchmark for LLMs as Intelligent Agents
 
NeuRBF: A Neural Fields Representation with Adaptive Radial Basis Functions
Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
Finite Scalar Quantization: VQ-VAE Made Simple
 
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
Aligning Large Multimodal Models with Factually Augmented RLHF
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models
 
CoRF: Colorizing Radiance Fields using Knowledge Distillation
The Cambridge Law Corpus: A Corpus for Legal AI Research
CodePlan: Repository-level Coding using LLMs and Planning
DualToken-ViT: Position-aware Efficient Vision Transformer with Dual Token Fusion
 
Parallelizing non-linear sequential models over the sequence length
Fast Feedforward Networks
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models
Boolformer: Symbolic Regression of Logic Functions with Transformers
 
FreeU: Free Lunch in Diffusion U-Net
Neurons in Large Language Models: Dead, N-gram, Positional
DreamLLM: Synergistic Multimodal Comprehension and Creation
Kosmos-2.5: A Multimodal Literate Model
End-to-End Speech Recognition Contextualization with Large Language Models
The Languini Kitchen: Enabling Language Modelling Research at Different Scales …
 
Graph Neural Networks Use Graphs When They Shouldn't
Large Language Models for Compiler Optimization
OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch
Baichuan 2: Open Large-scale Language Models
Language Modeling Is Compression
FoleyGen: Visually-Guided Audio Generation
 
Textbooks Are All You Need II: phi-1.5 technical report
DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning
Optimize …
 
Large-Scale Automatic Audiobook Creation
CityDreamer: Compositional Generative Model of Unbounded 3D Cities
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts
High-Quality Entity Segmentation
 
Large Language Models as Optimizers
FLM-101B: An Open LLM and How to Train It with $100K Budget
XGen-7B Technical Report
Tracking Anything with Decoupled Video Segmentation
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
 
SLiMe: Segment Like Me
Matcha-TTS: A fast TTS architecture with conditional flow matching
Physically Grounded Vision-Language Models for Robotic Manipulation
Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
 
One Wide Feedforward is All You Need
Efficient RLHF: Reducing the Memory Usage of PPO
PromptTTS 2: Describing and Generating Voices with Text Prompt
AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections
 
Fast Inference from Transformers via Speculative Decoding
YaRN: Efficient Context Window Extension of Large Language Models
VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
 
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images
BioCoder: A Benchmark for Bioinformatics Code Generation with Contextual Pragmatic Knowledge
MVDream: Multi-view Diffusion for 3D Generation
Can Programming Languages Boost Each …
 
Break-A-Scene: Extracting Multiple Concepts from a Single Image
The Poison of Alignment
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records
ORES: Open-vocabulary Responsible Visual Synthesis
 
PMET: Precise Model Editing in a Transformer
Interpretable Graph Neural Networks for Tabular Data
Nougat: Neural Optical Understanding for Academic Documents
Relighting Neural Radiance Fields with Shadow and Highlight Hints
 
Scalable Diffusion Models with Transformers
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
StableVideo: Text-driven Consistency-aware Diffusion Video Editing
Exploiting Diffusion Prior for Real-World Image Super-Resolution
 
Diversifying AI: Towards Creative Chess with AlphaZero
Graph of Thoughts: Solving Elaborate Problems with Large Language Models
Dataset Quantization
We Don't Need No Adam, All We Need Is EVE: On The Variance of Dual Learning Rate And Beyond
SRFormer: Empowering Regression-Based Text Detection Transformer with Segmentation
 
Reinforced Self-Training (ReST) for Language Modeling
Large Language Models as General Pattern Machines
Anaphoric Structure Emerges Between Neural Networks
Consciousness in Artificial Intelligence: Insights from the Science of Consciousness
 
Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model
Teach LLMs to Personalize -- An Approach inspired by Writing Education
DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory
Dual-Stream Diffusion Net for Text-to-Video Generation
 
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer
Platypus: Quick, Cheap, and Powerful Refinement of LLMs
RestoreFormer++: Towards Real-World Blind Face Restoration from Undegraded Key-Value Pairs
OctoPack: Instruction Tuning Code Large Language Models
 
Follow Anything: Open-set detection, tracking, and following in real-time
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining
Alexa, play with robot: Introducing the First Alexa Prize SimBot Challenge on Embodied AI
 
MetaGPT: Meta Programming for Multi-Agent Collaborative Framework
FocalFormer3D: Focusing on Hard Instance for 3D Object Detection
Shepherd: A Critic for Language Model Generation
JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models
 
Separate Anything You Describe
Pre-Trained Large Language Models for Industrial Control
ReCLIP: Refine Contrastive Language Image Pre-Training with Source Free Domain Adaptation
Simple synthetic data reduces sycophancy in large language models
3D Gaussian Splatting for Real-Time Radiance Field Rendering
 
A Practical Deep Learning-Based Acoustic Side Channel Attack on Keyboards
AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning
From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion
FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search
SynJax: Structured Probability Distributions for JAX
 
RWKV: Reinventing RNNs for the Transformer Era
DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales
OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
The All-Seeing Project: Towards…
 
Predicting masked tokens in stochastic locations improves masked image modeling
Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models
Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models
WizMap: Scalable Interactive Visualization for Exploring Large Machine Learning Embeddings
 
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
Discovering Adaptable Symbolic Algorithms from Scratch
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
Guiding Image Captioning Models Toward More Specific Captions
LLM-Rec: Personalized Recommendation via Prompting Large Language Models
The H…
 
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Med-Flamingo: a Multimodal Medical Few-shot Learner
PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization
 
Scaling TransNormer to 175 Billion Parameters
PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback
Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models
To Adapt or Not to Adapt? Real-Time Adaptation for Semantic Segmentation
How to Scale Your EMA
 
The case for 4-bit precision: k-bit Inference Scaling Laws
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
PUMA: Secure Inference of LLaMA-7B in Five Minutes
Optimized Network Architectures for Large Language Model Training with Billions of Parameters
A Real-World WebAgent with Planning, Long Context…
 
How will Language Modelers like ChatGPT Affect Occupations and Industries?
CopyRNeRF: Protecting the CopyRight of Neural Radiance Fields
STEVE-1: A Generative Model for Text-to-Behavior in Minecraft
StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces
Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-t…
 
Meta-Transformer: A Unified Framework for Multimodal Learning
Divide & Bind Your Attention for Improved Generative Semantic Nursing
Brain2Music: Reconstructing Music from Human Brain Activity
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
Instruction-following Evaluation through Verbalizer Manipulation
 