Video

Chariot Datasets Overview

With Striveworks' Chariot, users can easily create and catalog custom and sliceable datasets, speeding up data preparation for rapid model development. 

Supercharged Dataset Prep in 2 Minutes with Chariot

Transcript
(Automated transcription)
 
Hi, I'm Eric Korman, Chief Science Officer at Striveworks.

Creating and managing quality datasets for machine learning model training and evaluation can be a real pain.
 
With Striveworks' Chariot, though, data scientists can easily create and catalog custom and sliceable datasets, speeding up data preparation for rapid model development. 

So, here we'll take a look at a computer vision example. In this case, Chariot stores and versions not just the underlying images but also the annotations, and it allows you to adjust annotations and create new ones on platform, as necessary.

Another important thing with the dataset service is it gives you a nice history of exactly what went into that dataset, when new annotations were added, and so forth.
 
The service also allows filtering by various metadata. For example, maybe you just want to narrow in on a dataset from a specific period of time. That's something that the dataset service allows you to do.
 
Of course, before you start model training, you want to create a good train, test, val split. And Chariot gives you a lot of power and flexibility in creating that split.

So, besides just the percentages you want in train, test, and val, you can also filter by, say, task type, specific labels you want to include in that split, or maybe you just want to restrict to data that was captured in a certain time period. Chariot gives you all those options to really nail down the exact data you want to use for model development.
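As a rough illustration of the idea described above (filter first, then divide by percentage), here is a minimal sketch in plain Python. It is not Chariot's API: the `Datum` record, the `filtered_split` function, and all parameter names are hypothetical and exist only for this example.

```python
import random
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Datum:
    """Hypothetical record: one image plus the metadata we filter on."""
    id: str
    task_type: str
    labels: frozenset
    captured: date


def filtered_split(data, train_frac=0.6, test_frac=0.2,
                   task_type=None, labels=None, start=None, end=None, seed=0):
    """Keep records matching every given filter, then cut them into splits.

    Whatever is left after the train and test fractions goes to val.
    """
    wanted = frozenset(labels) if labels is not None else None
    kept = [
        d for d in data
        if (task_type is None or d.task_type == task_type)
        and (wanted is None or d.labels & wanted)  # at least one wanted label
        and (start is None or d.captured >= start)
        and (end is None or d.captured <= end)
    ]
    random.Random(seed).shuffle(kept)  # seeded, so the split is repeatable
    n_train = int(len(kept) * train_frac)
    n_test = int(len(kept) * test_frac)
    return {
        "train": kept[:n_train],
        "test": kept[n_train:n_train + n_test],
        "val": kept[n_train + n_test:],
    }


# Made-up example data: 20 images, alternating task types.
records = [
    Datum(
        id=f"img-{i}",
        task_type="detection" if i % 2 == 0 else "classification",
        labels=frozenset({"truck"} if i % 4 == 0 else {"car"}),
        captured=date(2023, 1, 1 + i),
    )
    for i in range(20)
]

splits = filtered_split(
    records,
    task_type="detection",
    start=date(2023, 1, 1),
    end=date(2023, 1, 31),
)
print({name: len(items) for name, items in splits.items()})
```

Filtering before splitting means the percentages apply only to the data you actually care about, rather than to the whole dataset.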

After creating a view, you create a snapshot, which fixes a point in time with the data. This is very important because, in the real world, datasets are not static. You're constantly getting new data, and as you get new data, you need to add some of it to the train split, some to the val split, and so forth. And you want to do that in a way that makes sure you have no data leakage.
 
And, so, Chariot does all of that for you.
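The transcript does not say how Chariot keeps splits leak-free as data arrives. One common technique, sketched below as an assumption and not as Chariot's method, is to assign each datum to a split by hashing its ID. The assignment depends only on the ID, so a datum never moves between splits when new data is added. The names `assign_split` and `take_snapshot` are hypothetical.

```python
import hashlib


def assign_split(datum_id, train=0.8, val=0.1):
    """Map an ID to a split deterministically; the remainder goes to test."""
    digest = hashlib.sha256(datum_id.encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    if bucket < train:
        return "train"
    if bucket < train + val:
        return "val"
    return "test"


def take_snapshot(datum_ids):
    """Freeze the split membership of the IDs known at this point in time."""
    snapshot = {"train": set(), "val": set(), "test": set()}
    for datum_id in datum_ids:
        snapshot[assign_split(datum_id)].add(datum_id)
    return {name: frozenset(ids) for name, ids in snapshot.items()}


# First snapshot: 100 images. Later, 100 more arrive and we snapshot again.
v1 = take_snapshot(f"img-{i}" for i in range(100))
v2 = take_snapshot(f"img-{i}" for i in range(200))

# Every datum keeps the split it had in the earlier snapshot.
for name in v1:
    assert v1[name] <= v2[name]
```

Because earlier snapshots are subsets of later ones split by split, a model evaluated on an old test set is never trained on those same items after the dataset grows.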