A few weeks ago, the Superb AI team sponsored and attended the International Conference on Computer Vision (ICCV) 2019. ICCV is one of the largest conferences in computer vision, comprising the main conference, workshops, tutorials, and exhibitions, where machine learning researchers, engineers, and students from both academia and industry meet to discuss the latest advances in the field. ICCV 2019 took place in Seoul, Korea from October 26 to November 2 and set a new record, with twice as many submitted papers and participants from all over the world as in previous years.
Superb AI was at ICCV to demonstrate the Superb AI Suite, our new-generation machine learning data platform designed to help ML teams accelerate their ML dev cycle. We introduced a pre-launch version of the Superb AI Suite as part of our exhibition, and more than 700 attendees signed up for our Suite waitlist. We are grateful to all the ML researchers, students, and industry experts who stopped by our booth to try out the demo and gave us a hand with our survey on machine learning data management. More details on our presence at ICCV 2019 and our learnings can be found below:
Introducing the Superb AI Suite
We wrote a technical white paper entitled “Superb AI Suite: Next Generation Data Platform for ML Teams” that describes our motivation behind developing the Superb AI Suite, its features and future direction. It is an introduction to Superb AI’s vision of designing and building a machine learning data management platform that all stakeholders in the ML data industry (researchers, engineers, PMs, and annotators) can collaborate on.
You can now download our white paper here👇
📊 What ICCV attendees say about ML data:
As part of this effort, we surveyed over a thousand attendees on many pressing issues in creating and managing machine learning data. First, we wanted to understand the participants’ profiles, so we started by asking for their affiliations. The majority of attendees were active academics in the field, ranging from students (52%) to professors (4%). Superb AI offers an academic discount for the Suite in addition to our Early Bird promotion, so if you’re a student reading this, reach us at email@example.com!
Second, we asked each participant about their experience in handling machine learning data, how they currently manage data collection and labeling, and which part of the process they find the most inconvenient or difficult:
It turns out, 39% of the respondents found annotation accuracy or turnaround time to be a critical challenge in ML projects. In addition, nearly a quarter of the respondents found these four aspects equally challenging:
- Customizing data tools to fit ever-changing specifications and requirements.
- Communicating with stakeholders (annotators, reviewers, etc.) regarding annotation issues.
- Utilizing existing, pre-trained ML models to generate “pre-labels” and auto-label the data.
- Uploading, downloading and sharing data and annotations with colleagues.
Although it seems quite clear that creating and managing ML datasets is still very challenging even for experienced engineers and researchers, we were pleased to see these results, as these pain points are exactly what Superb AI and our Suite aim to solve.
We also saw that most participants manage data collection and labeling by themselves, instead of relying on a dedicated in-house team or outsourcing to third-party services. This was as expected, because most participants of the conference were academics who usually work with open-source datasets. Among industry participants, the response was split evenly between using in-house teams and outsourcing to a third party. The Superb AI Suite is suitable for all of these machine learning teams and use cases. ML teams that use open-source datasets can easily manage very large datasets with our powerful filter, search, and collaboration features. Teams that create new datasets with in-house teams can also benefit from our project customization features, role-based access control, and issue tracking. Finally, any team can outsource data collection and labeling services as an add-on to the Superb AI Suite, monitor project progress, and give feedback in real time, so that they don’t have to waste months waiting for the dataset to be complete.
The next question was about which features of the Superb AI Suite they find the most useful for their ML project. A whopping 52% of the respondents pointed to the visualization of data, labeling, and statistics as being the most desirable. Visualizing the data, annotations, and their distribution is an important step toward understanding a dataset and successfully training a high-performance ML model; however, it is also one of the steps that often requires one-off custom code.
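To give a sense of the kind of one-off code this usually means, here is a minimal sketch that computes and prints the class distribution of a labeled dataset. It is not part of the Superb AI Suite; the annotation format (a list of dicts with a `label` key), the function names, and the sample data are all assumptions made for illustration.

```python
from collections import Counter


def class_distribution(annotations):
    """Return {label: share of all annotations}, most frequent label first.

    `annotations` is assumed to be a list of dicts with a "label" key.
    This format is hypothetical and chosen only for the example.
    """
    counts = Counter(ann["label"] for ann in annotations)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.most_common()}


def print_histogram(distribution, width=40):
    """Print one text bar per label, scaled to `width` characters."""
    for label, share in distribution.items():
        bar = "#" * round(share * width)
        print(f"{label:<12} {bar} {share:.1%}")


# Hypothetical sample annotations, for illustration only.
sample_annotations = [
    {"image": "0001.jpg", "label": "car"},
    {"image": "0001.jpg", "label": "car"},
    {"image": "0002.jpg", "label": "person"},
    {"image": "0003.jpg", "label": "car"},
]

print_histogram(class_distribution(sample_annotations))
```

Even a script this small has to be rewritten whenever the annotation format or the label set changes, which is why this step tends to stay a one-off.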
The ability to integrate existing ML models also drew strong interest. Most participants were ML experts who already have reasonably well-performing ML models, whether open-source or custom-built, and it would be a waste not to use them for labeling data. We fully understand these needs, as we were once in their shoes, so these two features are among the most important for the Superb AI Suite dev team. Interestingly, project managers (those who manage in-house teams or third-party services to create new datasets) pointed to a different set of features that we’ve developed, such as issue tracking, search, and filter, as being the most useful. This reflects how the Suite is designed and built for the various stakeholders around ML data.
Lastly, we asked an open-ended question about which features they desire the most. We got some great ideas from their responses, and these are some of the candidate features we’ve added to our backlog:
- Running AutoML with pre-labeled data
- Feature extraction and visualization
- Advanced dashboard to monitor the labeling team and project progress
At ICCV 2019, our goal was to present the core features of the Superb AI Suite to computer vision engineers and researchers and to receive their feedback so we can improve our product to fit their needs. Our team gained valuable insights from the survey results during the four-day demo sessions, and we were able to confirm our hypothesis that the practice of creating and managing machine learning datasets is still nascent and that there are many problems left to solve.
In retrospect, ICCV 2019 was an enriching experience for Superb AI. We confirmed that there’s a significant need for our all-in-one ML data platform that fully supports dataset collection, annotation, management, and analysis. We also received firm confirmation that our mission to simplify the AI dev cycle and democratize AI is both justified and much desired.