When I came to the machine learning space from software engineering in 2016, I was surprised by the messy experimentation practices, lack of control over model building, and a missing ecosystem of tools to help people deliver models confidently.
It was a stark contrast to the software development ecosystem, where you have mature tools for DevOps, observability, and orchestration to execute efficiently in production.
Seeing that led me to start Neptune.ai with a few friends back in 2017 to give ML practitioners the same level of confidence when developing and deploying models as software devs have when shipping apps.
A lot has changed since then:
- transformers and GPT-3 were created,
- PyTorch became a standard,
- Theano was deprecated and then came back again,
- the term “MLOps” was coined and then became popular.
Most importantly, the ML community realized that building a POC model in a notebook is not the end goal.
Today, companies big and small deploy and operate those models in production.
By no means are we at a “develop and deploy models confidently” stage just yet, but we’ve made huge progress as a community.
Speaking of progress, I am really happy to share that we’ve just raised an $8M Series A to continue building Neptune.ai.
Almaz Capital led the round with participation from our existing investors: btov Partners, Rheingau Founders, and TDJ Pitango.
We’ve come such a long way over these last few years. Today we have:
- tens of thousands of users,
- hundreds of paying teams,
- recognition from places like CB Insights, which listed us as a “Top 100 AI startup in 2021”.
As a Polish engineer at heart, I have only one way to express how I feel: not bad.
I am very grateful to:
- all the users and customers for invaluable feedback and support,
- the team for putting in their best effort every day,
- investors for believing in our vision.
While most companies in the MLOps space try to go wider and become platforms that solve all the problems of machine learning teams, we want to go deeper and become the best-in-class tool for experiment tracking and model registry.
We want to solve “just” this one part of the MLOps stack really well.
Why just one?
In a more mature software development space, there are almost no end-to-end platforms. So why should machine learning, which is even more complex, be any different?
I believe that by focusing on providing a great developer experience for experiment tracking and model registry, we can become one of the pillars on which teams build their MLOps tool stacks.
And to make this happen, we will invest a big chunk of that $8M in developer experience. Expect:
- more features built for specific ML use cases,
- even more responsive UI & APIs,
- revamped UX of our web UI,
- more integrations with the tools from the MLOps ecosystem,
- new ways of interacting via webhooks and notifications,
- better documentation,
- quicker feedback-to-feature loops.
But first and foremost, we’ll continue making experiment tracking and model registry “just work” for ML teams around the world.
If you’re interested in joining us, checking out the tool, or sharing feedback, I’d love to hear from you:
Hypefactors Case Study: Metadata Management for Teams that Don’t Run Experiments All the Time
7 mins read | Updated October 25th, 2021
Hypefactors is a technology company that works in the media intelligence and reputation tracking domain. They provide an ML-based Public Relations (PR) automation platform, including all the tools to power the PR workflow and visualize the results.
We spoke to the CTO of Hypefactors, Viet Yen Nguyen, who leads the technical teams and is responsible for the technical side of the company. He explains the business like this:
“Apple Inc. is worth approximately 2 trillion but if you go to their warehouses, stores, and sum up everything they own you don’t get to 2 trillion. You may get to maybe 60 billion worth of physical goods.
So where is the 1.94 trillion of value that is reflected in the price of the stock? It is the future prospects, the brand potential, the reputation, the intangible things.
We help teams actually measure and track that.”
What is the project about?
The data pipelines at Hypefactors monitor the whole media landscape, from social media to print media, television, radio, etc., to analyze changes in their customers’ brand reputation. This is achieved in two phases:
1. Getting the data from everywhere
2. Enriching the data with ML-based features
To analyze every form of data, from images and text to tabular data, they work on a variety of ML problems:
- NLP classification
- computer vision (e.g. segmentation)
- regression for business metrics
As they train and improve many enrichment models using different ML techniques, they naturally run many experiments and need a way to store the metadata those experiments generate.
With competent engineers in the data and AI teams, Viet’s team was able to achieve good results with all the components in the pipeline except for experiment tracking. Let’s look at the problems they faced and the measures they took to solve them.