Posts

Pinned

Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation

Authors Junha Lee1,2,*, Chunghyun Park1,2,*, Jaesung Choe1, Yu-Chiang Frank Wang1, Jan Kautz1, Minsu Cho2, Chris Choy1 1NVIDIA, 2POSTECH * indicates equal contribution Abstract We tackle open-...

Feb 4, 2025 Publications,Computer Vision

PeRFception: Perception using Radiance Fields

Authors Yoonwoo Jeong, Seungjoo Shin, Junha Lee, Christopher Choy, Anima Anandkumar, Minsu Cho, Jaesik Park NeurIPS, 2022 Abstract The recent progress in implicit 3D representation, i.e., Neural ...

Jun 1, 2022 Publications,Computer Vision

Deep Global Registration

We present Deep Global Registration, a differentiable framework for pairwise registration of real-world 3D scans. Deep global registration is based on three modules: a 6-dimensional convolutional ...

Apr 23, 2020 publication

High dimensional Convolutional Neural Networks for 3D Perception

Abstract The automation of mechanical tasks brought the modern world unprecedented prosperity and comfort. However, the majority of automated tasks have been simple mechanical tasks that only requ...

Apr 7, 2020 Publications,Thesis

Fully Convolutional Geometric Features

Authors Christopher Choy, Jaesik Park, Vladlen Koltun International Conference on Computer Vision (ICCV), 2019 Speed vs. Accuracy Pareto optimal frontier of previous methods and ours. Abstrac...

Jul 22, 2019 Publications,Computer Vision

4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks

In many robotics and VR/AR applications, 3D-videos are readily-available sources of input (a continuous sequence of depth images, or LIDAR scans). However, those 3D-videos are processed frame-by-fr...

Dec 25, 2018 publication

All Posts

CuTe DSL Basics: A Practical Introduction

CuTe DSL Basics — From Hello to Tiled Kernels This tutorial turns the CuTe DSL script snippets into a connected story: we start with a first GPU kernel, learn how dynamic printing and data types w...

Aug 28, 2025 Programming,Tutorial

CUDA Memory Load/Store Performance: A Comprehensive Benchmark Analysis

GPU memory performance is often the bottleneck in high-performance computing applications. Understanding the nuances of diffe...

Jul 17, 2025 Programming,CUDA

Monocular Dynamic View Synthesis: A Reality Check

Authors Hang Gao, Ruilong Li, Shubham Tulsiani, Bryan Russell, Angjoo Kanazawa, Christopher Choy NeurIPS, 2023 Abstract Indoor scene reconstruction from monocular images has long been sought af...

Sep 1, 2023 Publications,Computer Vision

ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation

Authors Yufei Wang, Zhou Xian, Feng Chen, Tsun-Hsuan Wang, Yian Wang, Katerina Fragkiadaki, Christopher Choy, Zackory Erickson, David Held RSS, 2022 Abstract Manipulating volumetric deformabl...

Jun 1, 2022 Publications,Computer Vision

Self-Calibrating Neural Radiance Fields

Authors Yoonwoo Jeong, Seokjun Ahn, Christopher Choy, Animashree Anandkumar, Minsu Cho, Jaesik Park ICCV, 2021 Abstract In this work, we propose a camera self-calibration algorithm for generic ...

Sep 1, 2021 Publications,Computer Vision

Learning 3D Representations of Dynamic Environments from a Single Camera

Authors Gengshan Yang, Minh Vo, Natalia Neverova, Deva Ramanan, Andrea Vedaldi, Christopher Choy CVPR, 2021 Abstract Learning 3D representations of dynamic environments from a single camera pre...

Jun 1, 2021 Publications,Computer Vision

Ghost of 3D Perception: Permutation Invariance Matters? Convolutions are Permutation Invariant!

Are you familiar with the Python dictionary class? Let me give you a quick test to check your level of knowledge. a = dict(); a[1.1] = 1; a[2.1] = 2; b = dict(); b[2.1] = 2; b[1.1] = 1. Do you think t...

May 27, 2021 research

Faster Neural Radiance Fields Inference

The Neural Radiance Fields (NeRF) proposed an interesting way to represent a 3D scene using an implicit network for high fidelity volumetric rendering. Compared with traditional methods to generate...

May 4, 2021 Research,Computer Vision

Setting Class Attributes in Python

Setting class attributes in Python can be tedious. In this post, I want to summarize a trick that I’ve been using to simplify this process. Class Attributes in __init__ In many cases, we have to save...

May 4, 2021 programming

Misconceptions about Memory and Good Documentation

Documentation probably is one of the most important tasks that no one has time for. I also overlook the importance as I get swept by a series of projects and requests. Recently, however, I learn mo...

Apr 20, 2021 unclassified

High-dimensional Convolutional Networks for Geometric Pattern Recognition

Many problems in science and engineering can be formulated in terms of geometric patterns in high-dimensional spaces. We present high-dimensional convolutional networks (ConvNets) for pattern recog...

Apr 23, 2020 publication

Pytorch Extension with a Makefile

Pytorch is a great neural network library that has both flexibility and power. Personally, I think it is the best neural network library for prototyping (advanced) dynamic neural networks fast and ...

Dec 28, 2018 programming

Text2Shape: Generating Shapes from Natural Language by Learning Joint Embeddings

Abstract We present a method for generating colored 3D shapes from natural language. To this end, we first learn joint embeddings of freeform text descriptions and colored 3D shapes. Our model com...

Mar 2, 2018 Publications,Computer Vision

Short Note on Matrix Differentials and Backpropagation

Mathematical notation is the convention that we all use to denote a concept in a concise mathematical formulation, yet sometimes there is more than one way to express the same equation. For example...

Jan 10, 2018 research

Regression vs. Classification: Distance and Divergence

In Machine Learning, supervised problems can be categorized into regression or classification problems. The categorization is quite intuitive, as the names indicate. For instance, if the output, or t...

Jan 5, 2018 research

Data Processing Inequality and Unsurprising Implications

We have heard enough about the great success of neural networks and how they are used in real problems. Today, I want to talk about how it was so successful (partially) from an information theoreti...

Jan 4, 2018 research

Learning Gaussian Process Covariances

A Gaussian process is a non-parametric model which can represent a complex function using a growing set of data. Unlike a neural network, which can also learn complex functions, a Gaussian proces...

Dec 15, 2017 research

DeformNet: Free-Form Deformation Network for 3D Shape Reconstruction from a Single Image

3D reconstruction from a single image is a key problem in multiple applications ranging from robotic manipulation to augmented reality. Prior methods have tackled this problem through generative mo...

Aug 18, 2017 publication

Weakly Supervised 3D Reconstruction with Manifold Constraint

Volumetric 3D reconstruction has witnessed significant progress in performance through the use of deep neural network based methods that address some of the limitations of traditional reconstruct...

Jun 1, 2017 publication

Expectation Maximization and Variational Inference (Part 2)

In the previous post, we covered variational inference and how to derive update equations. In this post, we will go over a simple Gaussian Mixture Model with the Dirichlet prior distribution over t...

Mar 23, 2017 research

Scene Graph Generation by Iterative Message Passing

Understanding a visual scene goes beyond recognizing individual objects in isolation. Relationships between objects also constitute rich semantic information about the scene. In this work, we expli...

Mar 14, 2017 publication

DESIRE: Deep Stochastic IOC RNN Encoder-decoder for Distant Future Prediction in Dynamic Scenes with Multiple Interacting Agents

We introduce a Deep Stochastic IOC RNN Encoder-decoder framework, DESIRE, with a conditional Variational Auto-Encoder and multiple RNNs for the task of future predictions of multiple interacting ...

Mar 14, 2017 publication

SegCloud: Semantic Segmentation of 3D Point Clouds

Abstract 3D semantic scene labeling is fundamental to agents operating in the real world. In particular, labeling raw 3D point sets from sensors provides fine-grained semantics. Recent works lever...

Mar 5, 2017 Publications,Computer Vision

Expectation Maximization and Variational Inference (Part 1)

Statistical inference involves finding the right model and parameters that represent the distribution of observations well. Let $\mathbf{x}$ be the observations and $\theta$ be the unknown paramete...

Feb 26, 2017 research

Dirichlet Process Mixtures and Inference (Part 1)

Statistical inference often requires modeling the distribution of data. There are two branches of statistical modeling: parametric and non-parametric methods. The former one specifies the data dist...

Dec 27, 2016 research

Universal Correspondence Network

We present a deep learning framework for accurate visual correspondences and demonstrate its effectiveness for both geometric and semantic matching, spanning across rigid motions to intra-class sha...

Sep 26, 2016 publication

3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction

Inspired by the recent success of methods that employ shape priors to achieve robust 3D reconstructions, we propose a novel recurrent neural network architecture that we call the 3D Recurrent Recon...

Sep 23, 2016 publication

Caffe Python Layer

A Python layer in Caffe can speed up the development process (Issue1703). Compile WITH_PYTHON_LAYER option: First, you have to build Caffe with the WITH_PYTHON_LAYER option. 1. Run make clean to delete all the ...

Jul 21, 2015 research

Gentle Introduction to Gaussian Process Regression

Parametric Regression uses a predefined function form to fit the data best (i.e., we make an assumption about the distribution of data by implicitly modeling them as linear, quadratic, etc.). Howev...

Jun 12, 2015 research

Reading protobuf DB in Python

Caffe uses Google Protocol buffer and LMDB or LevelDB to save data in a single unified database file. This allows faster data loading. Saving Database in LMDB I will not cover this step. If you a...

Apr 27, 2015 research

Barycentric Coordinate for Surface Sampling

To convert a mesh into a point cloud, one has to sample points that can uniformly cover the surface. To do so, one must choose the number of samples proportional to the area of a face (polygon). F...

Mar 26, 2015 research

Making a Caffe Layer

Caffe is one of the most popular open-source neural network frameworks. It is modular, clean, and fast. Extending it is tricky but not as difficult as extending other frameworks. Files to modify o...

Feb 26, 2015 research

Computing Neural Network Gradients

Computing the neural network gradient requires very simple calculus, yet can be tedious. Affine Transformation (Fully Connected Layer) Gradients For a simple fully connected layer with batch size...

Feb 5, 2015 research

Interesting Properties of Matrix Norms and Singular Values

Matrix norms and singular values have special relationships. In this post, I’ll summarize a few interesting properties of matrix norms and singu...

Jan 13, 2015 research
