OpenAI rang in the new year with a major announcement: two revolutionary new pieces of research: 1) DALL-E, which can generate images from text, and 2) CLIP, which provides zero-shot image classification without the requirement of training a model. This article focuses on CLIP: specifically, how the Vector robot can classify objects it sees, as long as it is given a list of text prompts describing the expected objects.

First, what is the big deal about CLIP? One of the major challenges in deep learning is the need for large labelled datasets to train a model.
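At its core, CLIP's zero-shot classification embeds the image and each candidate text prompt into a shared vector space, scores them by cosine similarity, and picks the best-matching prompt. Here is a minimal sketch of that scoring step, using random vectors in place of CLIP's real image and text encoders (the function name, embedding size, and logit scale here are illustrative assumptions):

```python
import numpy as np

def zero_shot_scores(image_emb, text_embs):
    """Cosine-similarity scores between one image embedding and several
    text-prompt embeddings, turned into probabilities with a softmax --
    the same scoring rule CLIP applies at inference time."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = 100.0 * (txt @ img)          # CLIP scales similarities by a learned temperature (~100)
    e = np.exp(logits - logits.max())     # numerically stable softmax
    return e / e.sum()

# Stand-ins for encoder outputs; a real pipeline would run CLIP's encoders
# on a photo of Vector and on prompts like
# ["a photo of a robot", "a photo of a coffee mug", "a photo of a cat"].
rng = np.random.default_rng(0)
image_emb = rng.normal(size=512)
text_embs = rng.normal(size=(3, 512))
probs = zero_shot_scores(image_emb, text_embs)
print("best prompt index:", probs.argmax())
```

The prompt with the highest probability is taken as the classification; no task-specific training is involved, which is what makes the approach zero-shot.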



For most of my career, I have been an engineer solely focused on the performance aspects of computer operating systems. Three years ago, I started driving an analytics- and AI-intensive feature for an emerging product at my current employer. Learning the subtleties of Artificial Intelligence (AI) was an uphill struggle for me. Luckily, at the same time, I was also a beta tester for a robotics company called Anki, which was releasing its first full-fledged toy robot, Cozmo. Along with Cozmo, Anki also released a full-fledged SDK that provides the ability to program Cozmo using…

Photo by Michal Balog on Unsplash

As many of you know, I am an avid robotics enthusiast. I own the Anki robots Cozmo and Vector, as well as the Bittle robot by Petoi. Many of you may know me as the creator of ‘Learn AI with a robot’, an effort to make learning AI simple. Over the last 2 years, I have been publishing interesting content in this Medium publication, “Programming Robots”. While this was a very fruitful and rewarding experience, and I enjoyed hearing back from many of you, I am sad to announce that this journey on the Medium platform is now ending. I am now moving…

One of the few successes that the Anki robots Vector and Cozmo (now owned by Digital Dream Labs) have had is in the exploration of Human Robot Interaction (HRI). HRI is a multidisciplinary research field that explores how humans perceive robots and interact with them. This is a very important research area because the success of robotics in everyday environments such as the home, schools, or shopping malls depends on how well humans deal with the presence of robots and cooperate with them on the tasks the robots are designed to help them accomplish. …

I am a proud father of a new programmable robot: Bittle, a pet dog. Bittle is the second robot made by Petoi, and a successor to Nybble. While the Anki Vector (now owned by DDL) is a great programmable robot for Artificial Intelligence (AI) based tasks such as perception, path planning, and animations, Bittle is a pretty sophisticated mechanical quadruped robot. It is a close lookalike of Spot from Boston Dynamics, but at 1/300th the cost, making it viable as a home companion. Petoi, founded by Rongzhong Li, is the developer of OpenCat, the open-source quadruped robot platform. The …

In my last article, we examined how OpenAI CLIP can classify an image among multiple options of provided text (prompts) using a pre-trained model, thus providing the ability of zero-shot image classification. We also examined how Roboflow helps you automate your data ingestion pipeline by providing a large variety of data preprocessing and augmentation techniques and the ability to export the dataset in multiple formats.
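Exporting a dataset in multiple formats largely comes down to converting the annotation geometry between conventions. As an illustrative sketch (the helper name is my own; the box conventions are the standard YOLO and Pascal VOC ones), converting a YOLO-style normalized box to absolute pixel corners looks like this:

```python
def yolo_to_corners(xc, yc, w, h, img_w, img_h):
    """Convert a YOLO box (normalized center x/y, width, height in [0, 1])
    to absolute pixel corners (xmin, ymin, xmax, ymax), the layout used by
    Pascal VOC-style annotation formats."""
    xmin = (xc - w / 2) * img_w
    ymin = (yc - h / 2) * img_h
    xmax = (xc + w / 2) * img_w
    ymax = (yc + h / 2) * img_h
    return xmin, ymin, xmax, ymax

# A box centered in a 640x480 image, covering half its width and height:
print(yolo_to_corners(0.5, 0.5, 0.5, 0.5, 640, 480))
# → (160.0, 120.0, 480.0, 360.0)
```

A tool that exports to multiple formats is essentially applying conversions like this one, plus the matching file layout, across every annotation in the dataset.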

Thanks to this nice notebook provided by Roboflow, one can now take a shot at understanding the potential of OpenAI CLIP inference on multiple datasets, including the Anki Vector robot dataset.

Photo by Andy Kelly on Unsplash

One of the most popular videos on my YouTube channel is an illustration of how one can draw on Cozmo’s face simply by moving the cube around. The program is simple to write using the Cozmo Code Lab, but there are very important concepts to learn here. One of the most essential jobs of commercial robots is to grasp and move objects. Behind this technology is the art of determining the pose of an object, known as 6D Object Pose Estimation (6DOPE). Before going into a deep dive on 6DOPE, I would first like you to review the following video.
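A 6D pose is simply a rigid transform: three rotation parameters plus three translation parameters. A toy sketch of applying such a pose to an object's corner points is below (pure NumPy; the yaw-only rotation is a simplification I chose for clarity, and a real 6DOPE system would estimate these parameters from an image rather than being handed them):

```python
import numpy as np

def pose_transform(points, yaw, translation):
    """Apply a 6D pose -- here simplified to a yaw rotation about the
    z axis plus a 3D translation -- to Nx3 object-frame points, giving
    their positions in the camera/world frame."""
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    return points @ R.T + translation

# Corners of a unit cube (think of Cozmo's light cube), posed at 90
# degrees of yaw and moved 2 units along x:
cube = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)],
                dtype=float)
posed = pose_transform(cube, np.pi / 2, np.array([2.0, 0.0, 0.0]))
print(posed[0])  # → [2. 0. 0.] (the origin corner lands at the translation)
```

Estimating the inverse of this mapping, i.e. recovering the rotation and translation from where the corners appear in a camera image, is exactly the 6DOPE problem.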

In my previous post, I discussed how Roboflow features the Anki Vector public dataset, which can be used to train Vector to recognize another Vector using the Anki Vector SDK. This post captures how you can add your own images to the dataset and build your own. More images add more diversity to the dataset and lead to better-trained models. Here is what you will have to do.


You will need the following:

  1. Two Anki Vector robots in working condition.
  2. The Anki Vector SDK dev environment set up, ideally in an Ubuntu or alternative Linux environment. …
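Once you have captured images with the Vector SDK's camera feed (or from any other source), the remaining bookkeeping is splitting them into training and validation folders before upload. A small stdlib-only sketch of that step (the folder names, split ratio, and function name are my own illustrative choices, not part of any SDK):

```python
import random
import shutil
from pathlib import Path

def split_dataset(src_dir, dst_dir, val_fraction=0.2, seed=42):
    """Copy the .jpg images in src_dir into dst_dir/train and dst_dir/val,
    holding out val_fraction of them for validation. Returns the
    (train, val) image counts."""
    images = sorted(Path(src_dir).glob("*.jpg"))
    random.Random(seed).shuffle(images)      # fixed seed => reproducible split
    n_val = int(len(images) * val_fraction)
    for split, files in (("val", images[:n_val]), ("train", images[n_val:])):
        out = Path(dst_dir) / split
        out.mkdir(parents=True, exist_ok=True)
        for f in files:
            shutil.copy2(f, out / f.name)
    return len(images) - n_val, n_val
```

Re-running with the same seed keeps the train/validation membership stable, so models trained before and after you add new images remain comparable.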


Recently, I published an article in “Towards AI” on how to train a YOLO v5 model that can be used by Vector to detect another Vector. Brad Dwyer, founder of Roboflow, read the article and reached out to me to offer a free upgraded Roboflow account, which provides all the functionality Roboflow offers, including the ability to release publicly available datasets. Thanks, Brad!

So that motivated me to work further on this. As I have documented before, the main barrier to using supervised Machine Learning (ML) and Deep Learning (DL) is the requirement of large volumes of high-quality labelled…


Avid biker. VMware engineer. Robotics enthusiast. Thoughts in this forum reflect my own opinions. I write about Robotics, Vector, Cozmo, and VMware.
