[Hiring] AI Research Engineer: Vision AI / VLM / Physical AI @Centific
Artificial Intelligence
Salary $140k - $150k annually
Remote Location
🇺🇸 USA Only
Employment Type full-time
Posted 3wks ago


Role Description

Are you pushing the frontier of computer vision, multimodal large models, and embodied/physical AI—and have the publications to show it? Join us to translate cutting-edge research into production systems that perceive, reason, and act in the real world.

We are building state-of-the-art Vision AI across 2D/3D perception, egocentric/360° understanding, and multimodal reasoning. As an AI Research Engineer, you will own high-leverage experiments from paper → prototype → deployable module in our platform.

You could be part of:

  • Computer Vision team: Dive into 3D reconstruction, scene understanding, and visual AI, working with architectures like VGG-T (Visual Geometry Grounded Transformer).
  • Physical AI Robotics team: Work at the intersection of simulation, robotics, and AI. Leverage NVIDIA’s Omniverse, Isaac Sim, and GR00T for advanced 3D simulation and robotics training.

What You’ll Do

  • Advance Visual Perception: Build and fine-tune models for detection, tracking, segmentation (2D/3D), pose & activity recognition, and scene understanding (incl. 360° and multi-view).
  • Multimodal Reasoning with VLMs: Train/evaluate vision-language models (VLMs) for grounding, dense captioning, temporal QA, and tool use.
  • Physical AI & Embodiment: Prototype perception-in-the-loop policies that close the gap from pixels to actions.
  • Data & Evaluation at Scale: Curate datasets, author high-signal evaluation protocols/KPIs, and run ablations.
  • Systems & Deployment: Package research into reliable services on a modern stack (Kubernetes, Docker, Ray, FastAPI).
  • Agentic Workflows: Orchestrate multi-agent pipelines that combine perception, reasoning, simulation, and code generation.
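
The "Data & Evaluation at Scale" bullet above mentions authoring high-signal evaluation protocols for detection models. As a hedged sketch of what such a protocol's core might look like (all names, box formats, and thresholds here are illustrative assumptions, not Centific's actual stack), a minimal IoU-based detection scorer could be:

```python
# Illustrative sketch only: a minimal detection-evaluation protocol.
# Boxes are (x1, y1, x2, y2) tuples; function names are hypothetical.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def precision_recall(preds, gts, thr=0.5):
    """Greedy one-to-one matching of predictions (assumed sorted by
    confidence) against ground-truth boxes at an IoU threshold."""
    matched = set()
    tp = 0
    for p in preds:
        best_j, best_iou = -1, thr
        for j, g in enumerate(gts):
            if j in matched:
                continue
            v = iou(p, g)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(gts) if gts else 0.0
    return precision, recall
```

Production protocols (e.g., COCO-style AP) sweep IoU thresholds and confidence cutoffs; this stripped-down version only shows the matching idea a candidate would be expected to extend.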

Example Problems You Might Tackle

  • Long-horizon video understanding from egocentric or 360° video.
  • 3D scene grounding: linking language queries to objects, affordances, and trajectories.
  • Fast, privacy-preserving perception for on-device or edge inference.
  • Robust multimodal evaluation: temporal consistency, open-set detection, uncertainty.
  • Vision-conditioned policy evaluation in simulation with sim2real stress tests.
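
"Temporal consistency" in the evaluation bullet above could be operationalized in many ways; one deliberately minimal, purely hypothetical metric (not from the posting) scores how often a model's per-frame predictions agree across adjacent frames:

```python
# Hypothetical sketch: temporal-consistency score for per-frame labels.
# "Consistency" here means the fraction of adjacent frame pairs whose
# predicted label agrees; real protocols would also weigh confidence
# and track identity.

def temporal_consistency(labels):
    """Fraction of consecutive-frame pairs with identical predictions."""
    if len(labels) < 2:
        return 1.0  # zero or one frame is trivially consistent
    agree = sum(1 for a, b in zip(labels, labels[1:]) if a == b)
    return agree / (len(labels) - 1)
```

For example, the sequence `["cat", "cat", "dog", "dog"]` has one label flip across three adjacent pairs, scoring 2/3; a flicker-free track scores 1.0.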

Qualifications

  • Master's or Ph.D. in CS/EE/Robotics (or a related field), actively publishing in CV/ML/Robotics.
  • Strong PyTorch (or JAX) and Python; comfort with CUDA profiling and mixed precision training.
  • Demonstrated research in computer vision and at least one of: VLMs, embodied/physical AI, 3D perception.
  • Proven ability to move from paper → code → ablation → result with rigorous experiment tracking.

Preferred Qualifications

  • Experience with video models (e.g., TimeSFormer/MViT/VideoMAE), diffusion or 3D GS/NeRF pipelines, or SLAM/scene reconstruction.
  • Prior work on multimodal grounding or temporal reasoning.
  • Familiarity with ROS2, DeepStream/TAO, or edge inference optimizations.
  • Scalable training: Ray, distributed data loaders, sharded checkpoints.
  • Strong software craft: testing, linting, profiling, containers, and reproducibility.
  • Public code artifacts (GitHub) and first-author publications or strong open source impact.

What Success Looks Like

  • A publishable or open-sourced outcome (with company approval) or a production-ready module that measurably moves a product KPI.
  • Clean, reproducible code with documented ablations and an evaluation report that a teammate can rerun end-to-end.
  • A demo that clearly communicates capabilities, limits, and next steps.

Benefits

  • Real impact: Your research ships—powering core features in our MVPs and products.
  • Mentorship: Work closely with our Principal Architect and senior engineers/researchers.
  • Velocity + Rigor: We balance top-tier research practices with pragmatic product focus.
  • Salary: $140K - $150K Annually.