AI with Papers - Artificial Intelligence & Deep Learning

All the AI with papers. Fresh updates every day about #DeepLearning, #MachineLearning, #LLM & #ComputerVision. Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/ #AI #chatGPT

Members: 17.5K

Score: 90/100

Category: AI / Tech

Updated: Apr 10, 2026

Topics

What is Material Magic Wand for 3D mesh material grouping?

1 post

What is OccAny? Universal 3D Occupancy Prediction Framework

1 post

What is INSID3? Training-Free In-Context Segmentation Explained

1 post

What is Samsung RawGen? Camera Raw Image Generation Explained

1 post

Best Frontier Tech Investment: Space, AI Agents, Quantum or GPUs?

1 post

Netflix Void: AI Video Object Removal Framework Explained

1 post

HandX: Bimanual Hand Motion Dataset and Foundation Model

1 post

Geoffrey Hinton Interview in Pavia: Watch the Full Clip

1 post

Official Clip Release Date and Preview Details

1 post

IndustryShapes Dataset for 6D Pose Estimation of Industrial Tools

1 post

What is Holi-Spatial? Automated 3D Spatial Intelligence from Video

1 post

Apple LITO: Latent Flow Matching for Image-to-3D Conversion

1 post

PhysMoDPO: Physically-Plausible Human Motion Generation Framework

1 post

Fast SAM 3D Body: 10,000x Faster Humanoid Control from RGB

1 post

What is GaussianGPT? 3D Scene Generation with Transformers

1 post

Vanast: AI Video Garment Transfer with Human Animation

1 post

What is DVD? New SOTA Video Depth Estimation Model

1 post

What is BoxerNet? Meta's 2D to 3D Bounding Box Transformer

1 post

What is OmniStream? Unified Visual Backbone for Streaming

1 post

What is PAM for Hand-Object Interaction Video Generation?

1 post

Generalized Human Tracking: Beijing Institute of Technology Framework

1 post

Google FIT Dataset: 1.1M Virtual Try-On Images & Measurements

1 post

Recent Posts

🪞 1.1M Metric VTON Dataset 🪞 👉 Google's Fit-Inclusive Try-on: large-scale VTO dataset comprising over 1.13M try-on image triplets accompanied by precise body and garment measurements. Repo & dataset announced 💙 👉 Review https://t.ly/cs-pt 👉 Paper arxiv.org/pdf/2604.08526 👉 Project johannakarras.github.io/FIT/ 👉 Repo TBA

Apr 10, 2026 394 views

Here's the preview; the full clip from the official source comes tomorrow :)

Apr 8, 2026 1.3K views

Hinton was our guest in Pavia (remotely) 💚😈 Would you like to see a clip of the interview?

Apr 8, 2026 1.4K views

🔥 BoxerNet: SOTA 2D->3D BBs 🔥 👉 Boxer by META: transformer-based network to lift 2D BB proposals into 3D, followed by multi-view fusion and geometric filtering to produce globally consistent, de-duplicated 3D BBs in metric world space. Repo under A-NC 4.0 International 💙 👉 Review https://t.ly/mlmV1 👉 Paper https://arxiv.org/pdf/2604.05212 👉 Project facebookresearch.github.io/boxer/ 👉 Repo github.com/facebook...

Apr 8, 2026 1.7K views

🔥 Vanast: VTON w/ Human Animation 🔥 👉 SNU unveils a novel unified framework that generates garment-transferred human animation videos directly from a single human image, garment images, and a pose guidance clip. Repo announced 💙 👉 Review https://t.ly/c0t79 👉 Paper arxiv.org/pdf/2604.04934 👉 Project hyunsoocha.github.io/vanast/ 👉 Repo github.com/snuvclab/vanast

Apr 7, 2026 1.9K views

🍎 Video Object Deletion 🍎 👉 Void by Netflix is a novel video object removal framework designed to perform physically-plausible inpainting in very complex scenarios. Repo under Apache 2.0 💙 👉 Review https://t.ly/cMVny 👉 Paper https://arxiv.org/pdf/2604.02296 👉 Project https://void-model.github.io/ 👉 Repo https://github.com/Netflix/void-model

Apr 3, 2026 3.2K views

If you had to invest $1B TODAY in a frontier tech for the next decade, would you invest in space, agentic AI, quantum, or frugal GPUs? Vote here: https://t.ly/hSx6i

Apr 2, 2026 2.8K views

🪬 Camera Raw Image Generation 🪬 👉 RawGen by #Samsung is a generative approach that learns the complex distribution of raw sensor data directly, enabling high-fidelity generation from either text descriptions or standard sRGB images across arbitrary camera sensors. Generate a linear raw image once, then apply any ISP operation. Repo announced 💙 👉 Review https://t.ly/_QVKP 👉 Paper https://arxiv.org/pdf/2604.00093 👉 Pr...

Apr 2, 2026 2.8K views

🌵 SOTA Training-Free In-Context Segmentation 🌵 👉 INSID3 is the new SOTA, training-free approach that segments concepts at varying granularities using only frozen DINOv3 features, given an in-context example. Repo under Apache 2.0 💙 👉 Review https://t.ly/NVWHN 👉 Paper arxiv.org/pdf/2603.28480 👉 Project visinf.github.io/INSID3/ 👉 Repo github.com/visinf/INSID3

Apr 1, 2026 2.8K views

👌 HandX: Scaling Hands Motion 👌 👉 HandX is a unified foundation spanning data, annotation, and evaluation: a novel large-scale dataset of bimanual & dexterous motions with fine-grained textual annotations. Around 6M frames. Repo available 💙 👉 Review https://t.ly/1nGxw 👉 Paper https://arxiv.org/pdf/2603.28766 👉 Project https://handx-project.github.io/ 👉 Repo github.com/handx-project/HandX

Mar 31, 2026 2.7K views

💥 GaussianGPT 3D GSC 💥 👉 From TUM, GaussianGPT: transformer-based 3D Gaussians generation via next-token prediction -> full, complex 3D indoor scenes. Repo announced 💙 👉 Review https://t.ly/bj-lL 👉 Paper arxiv.org/pdf/2603.26661 👉 Project nicolasvonluetzow.github.io/GaussianGPT/ 👉 Repo TBA

Mar 31, 2026 2.7K views

🐍 Pose-Appearance-Motion for HOI 🐍 👉 PAM is a novel Pose–Appearance–Motion Engine for controllable, SOTA Hand–Object Interaction video generation. Repo/models available 💙 👉 Review https://t.ly/JU4MD 👉 Paper arxiv.org/pdf/2603.22193 👉 Project gasaiyu.github.io/PAM.github.io/ 👉 Repo https://github.com/GasaiYU/PAM

Mar 25, 2026 4.2K views

🦪 OccAny: Universal 3D Occupancy 🦪 👉 OccAny by Valeo is a novel unified framework for generalized unconstrained urban 3D occupancy prediction. Repo under Apache 2.0 💙 👉 Review https://t.ly/FFiU0 👉 Paper https://arxiv.org/pdf/2603.23502 👉 Project https://valeoai.github.io/OccAny/ 👉 Repo https://github.com/valeoai/OccAny

Mar 25, 2026 4.2K views

🍓 Material-Aware Grouping 🍓 👉 Material Magic Wand (Adobe) is a tool for material-aware grouping of parts in untextured 3D meshes. Given one selected part, it automatically retrieves the other parts of the same shape that share its material. Repo announced 💙 👉 Review https://t.ly/q00SU 👉 Paper https://arxiv.org/pdf/2603.17370 👉 Project umangi-jain.github.io/material-magic-wand/ 👉 Repo TBA

Mar 19, 2026 5.1K views

🍧 10,000× faster SAM-3D 🍧 👉 Fast SAM 3D Body achieves up to 10.9× speedup and over 10,000× faster MHR-to-SMPL conversion -> real-time humanoid control from RGB. Repo available 💙 👉 Review https://t.ly/uHx84 👉 Paper https://arxiv.org/pdf/2603.15603 👉 Project yangtiming.github.io/Fast-SAM-3D-Body-Page/ 👉 Repo https://github.com/yangtiming/Fast-SAM-3D-Body

Mar 17, 2026 4.5K views

🤖 Physically-Plausible Human 🤖 👉 PhysMoDPO is a novel direct preference optimization framework for humanoid motion generation. Repo under MIT 💙 👉 Review https://t.ly/clf8w 👉 Paper https://arxiv.org/pdf/2603.13228 👉 Project https://mael-zys.github.io/PhysMoDPO/ 👉 Repo https://github.com/Mael-zys/PhysMoDPO

Mar 16, 2026 4.1K views

🌈 New SOTA Video Depth 🌈 👉 DVD is the new Video Depth Estimation SOTA, with the full training suite available under Apache 2.0 💙 👉 Review https://t.ly/gpCkG 👉 Paper https://arxiv.org/pdf/2603.12250 👉 Project https://dvd-project.github.io/ 👉 Repo github.com/EnVision-Research/DVD

Mar 13, 2026 4.4K views

☄️ OmniStream Backbone ☄️ 👉 Novel unified streaming visual backbone that effectively perceives, reconstructs, and acts from diverse visual inputs. Repo/Models announced 💙 👉 Review https://t.ly/_zZMO 👉 Paper arxiv.org/pdf/2603.12265 👉 Project go2heart.github.io/omnistream/ 👉 Repo github.com/Go2Heart/OmniStream

Mar 13, 2026 4.2K views

🍓 Surface Light Tokenizer 🍓 👉 Apple unveils LITO, a novel latent flow matching model that enables HQ image-to-3D. Its latent representation encodes a surface light field into a compact set of latent vectors. Impressive results but no code 🥲 👉 Review https://t.ly/xcWNe 👉 Paper https://lnkd.in/dYHwY4YX 👉 Project https://lnkd.in/dtJT8bXy

Mar 12, 2026 4.1K views

🔥 Holistic 3D Spatial Intelligence 🔥 👉 Holi-Spatial is the first fully automated pipeline capable of converting raw video streams into holistic 3D spatial annotations without human intervention. Code/Data announced 💙 👉 Review https://t.ly/PDpr9 👉 Paper https://lnkd.in/dTbMuZCm 👉 Project https://lnkd.in/d66CYB4q 👉 Repo https://lnkd.in/dAGzShXj

Mar 11, 2026 4.0K views

🤖 Generalized Human Tracking 🤖 👉 Beijing Institute of Technology & Humanoid Robotics Shanghai present a novel learning framework for general humanoid whole-body control. Impressive results in imitation. 👉 Review https://t.ly/ucmuB 👉 Paper arxiv.org/pdf/2601.23080 👉 Project zeonsunlightyu.github.io/RGMT.github.io

Feb 13, 2026 4.4K views

🛠️ IndustryShapes 6D Pose 🛠️ 👉 IndustryShapes by NTUA is a new RGB-D dataset of industrial tools, designed for both instance-level and novel object 6D pose estimation. Dataset available 💙 👉 Review https://t.ly/KKcuH 👉 Paper https://arxiv.org/pdf/2602.05555 👉 Project https://pose-lab.github.io/IndustryShapes/ 👉 Dataset https://huggingface.co/datasets/POSE-Lab/IndustryShapes

Feb 12, 2026 4.4K views