Loopy News: February 2022
TL;DR: We launched the Database for AI, rolled out Hub 2.3.0 (enhanced video datasets, Aim + Hub integration, and 5 more features), got featured on KDnuggets, & are offering 50% off MLconf tickets since we’re presenting there
The Database for AI is here
I’m not gonna lie, February was busy.
We’ve been working on the Database for AI for the past year, and this month we shared it with the ML community. Below is an overview of some of the main features of the Activeloop Platform. Follow Davit’s LinkedIn for more Activeloop Platform updates & sneak peeks, and check out our new Machine Learning Datasets Catalogue here for details.
Quickly visualize data of any size to mitigate bias in datasets & build highly accurate ML models
Our visualization engine now also works locally. Just drag. And drop.
Dataset version control + visualization = 🧡. Track changes of datasets by your entire team and visualize each commit, instantly.
Hub 2.3.0 is out
Faster & more robust video datasets
- Store and stream videos of any size
- Use ds.videos[sample_index][frame_index].numpy() to rapidly lazy-load specific video frames without downloading and decompressing the entire video
- 2X faster video decompression
- Take a look at a demo of these features made by Fayaz Rahman, an engineer at Activeloop
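The frame-level indexing mentioned in the list above can be sketched roughly as follows. This is a minimal illustration, not Hub's own code: evenly_spaced_indices and load_frames are hypothetical helper names, and ds is assumed to be an already-loaded Hub dataset with a tensor named videos, as in the snippet from the release notes.

```python
def evenly_spaced_indices(total_frames, num_frames):
    """Return up to num_frames indices spread evenly over range(total_frames)."""
    if total_frames <= 0 or num_frames <= 0:
        return []
    num_frames = min(num_frames, total_frames)
    step = total_frames / num_frames
    return [int(i * step) for i in range(num_frames)]


def load_frames(ds, sample_index, frame_indices):
    """Fetch only the requested frames of one video.

    Assumes ds exposes a `videos` tensor and supports the
    ds.videos[sample_index][frame_index].numpy() pattern quoted above.
    """
    return [ds.videos[sample_index][i].numpy() for i in frame_indices]
```

With a real dataset, something like load_frames(ds, 0, evenly_spaced_indices(total_frames, 8)) would pull eight frames from the first video. The total frame count is passed in explicitly here because how to query it is not covered in this post.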
Aim + Hub integration
- Hub datasets now interface seamlessly with Aim, Aimstackio’s easy-to-use and performant open-source experiment tracker. Check out the integration details and the repository code.
Enhancements to Hub
- Improved the retry behavior on S3 in order to minimize the likelihood of timeout errors
- Import data into Hub directly from URLs using hub.read(URL)
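As a rough sketch of how the URL import above might be used in a loop: append_from_urls is a hypothetical helper, not part of Hub, and it assumes the target tensor accepts the objects returned by hub.read through an append method. The reader can be swapped out, which also makes the helper easy to try without a network connection.

```python
def append_from_urls(tensor, urls, read=None):
    """Append one sample per URL to a tensor-like object and return the count.

    By default the sample is produced by hub.read(url); passing `read`
    lets you inject a different reader (an assumption made for testing).
    """
    if read is None:
        import hub  # imported lazily so the helper works with an injected reader
        read = hub.read
    count = 0
    for url in urls:
        tensor.append(read(url))
        count += 1
    return count
```

The URLs themselves are up to you; none are given in this post.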
Customizable installation, dimensionality & conventions for each type explained in new docs
- 4x faster and more customizable pip installation explained here
- New Htype doc (understand dimensionality & conventions for each htype)
News on the Hub blog
- KDnuggets blog post feature: From Oracle to Databases for AI: The Evolution of Data Storage
- Reddit post: Database for AI: Visualize, version-control & explore image, video and audio datasets. Help us get to 1000 upvotes :)
Catch us at MLconf 2022 on March 31st, presenting the Database for AI
Davit will be presenting the Database for AI at MLconf NYC. Get tickets at 50% off with our Activeloop discount here. Hope to see you there :)
Hub and Activeloop Updates event
Check out the recording of the Hub and Activeloop Updates event, where Ivo, Davit, Sasun & Mikayel covered the roadmap for the future of Hub and the Activeloop Platform. They talked about brand new features of the Activeloop Platform such as dataset visualization, version control, and querying.
Community heroes 🧡
Shout out to Alexander Demidko and Gaurav Sukhramani for contributing to Hub. We really appreciate it 😄
Thank you to Amal Mathew, Abid Ali Awan, Alok Chilka, Dyllan McCreary, Xuechen Liu, Sergii Rizvan, Muhammad Haritsah Mukhlis, Parth Dandavate & Gaurav Sukhramani for helping us build the Database for AI by giving us community feedback :)