Comet.ml nabs $4.5M for more efficient machine learning model management

As we get further along in the new way of working, the new normal if you will, finding more efficient ways to do just about everything is becoming paramount for companies looking at buying new software services. To that end, Comet.ml announced a $4.5 million investment today as it tries to build a more efficient machine learning platform.

The money came from existing investors Trilogy Equity Partners, Two Sigma Ventures and Founder’s Co-op. Today’s investment comes on top of an earlier $2.3 million seed.

“We provide a self-hosted and cloud-based meta machine learning platform, and we work with data science AI engineering teams to manage their work to try and explain and optimize their experiments and models,” company co-founder and CEO Gideon Mendels told TechCrunch.

In a growing field with lots of competitors, Mendels says his company’s ability to move easily between platforms is a key differentiator.

“We’re essentially infrastructure agnostic, so we work whether you’re training your models on your laptop, your private cluster or on many of the cloud providers. It doesn’t actually matter, and you can switch between them,” he explained.

The company has 10,000 users on its platform across a community product and a more advanced enterprise product that includes customers like Boeing, Google and Uber.

Mendels says Comet has been able to take advantage of the platform’s popularity to build models based on data customers have made publicly available. The first one involves predicting when a model begins to show training fatigue. The Comet model can see when this is happening and signal data scientists to shut the model down 30% faster than this kind of fatigue would normally surface.

The company launched in Seattle at TechStars/Alexa in 2017. The community product debuted in 2018.

Human Capital is an engineering talent agency and a VC fund all in one

Michael Ovitz didn’t invent the idea of a talent agency, but one might argue that he perfected it. He co-founded CAA in 1975 and grew it into the world’s leading talent agency, serving as chairman for 20 years. Now, Ovitz is investing in a brand new type of talent agency called Human Capital.

Human Capital is a hybrid organization, one part VC fund, one part recruiting business and one part creative agency. (Human Capital did not invest in its agency startup from its VC fund.) The Human Capital VC fund has $210 million in assets under management.

The Human Capital recruitment/agency company, founded by former General Catalyst associate Armaan Ali and Stanford grad Baris Akis, looks to provide for tech engineers the same services that Ovitz provided to actors and creatives back in the 70s, 80s and 90s. Engineers are some of the most sought-after talent in Silicon Valley and across the globe. And while big corporations and high-growth startups duke it out over these young engineers, the candidates themselves have little to no guidance around where they should go, what they should expect during the process, and, in some cases, what they should expect to earn.

Ovitz — alongside Qasar Younis, founder of Applied Intuition and former partner and COO of YC; Adam Zoia, founder and chairman of Glocap; Stephen Ehikian, co-founder and CEO of Airkit; and other financial institutions and LPs — recently injected $15 million into Human Capital, which is valued in the hundreds of millions according to the company.

Human Capital looks to pair the brightest engineers with the right company for them, while giving startups a new way to approach recruitment. Thus far, the company has 5,000 members (engineers) and has placed them at startups like Brex, Grammarly, Robinhood and more.

Human Capital starts by doing outreach on university campuses with outstanding engineering programs, setting up coffee with engineers who have been recommended or referred by alumni of the program. Once accepted as a member, the engineer explains to Human Capital what type of role they’re interested in, whether it’s at a big corporation, a high-growth startup or an early-stage company where they have the opportunity to build something from scratch.

The recruitment team at Human Capital then coaches the engineer through the interview process and beyond, helping with decision-making around promotions, understanding equity and negotiating new offers.

The org never charges the engineer, but rather takes a commission on the engineer’s annual income for the first year from the startup that recruited them.

Ali explained to TechCrunch how Human Capital is operating during the coronavirus pandemic, describing a situation in which the top talent that is in the market right now has a level of uncertainty about the future, leading them to seek positions at huge companies like Facebook and Google.

“Our hypothesis when we started this was that there are amazing businesses that are being run better at an earlier stage and have a proxy for that same type of stability [at a Google or Facebook] via their access to capital, alongside other foundational pieces of business security, such as their business model, unit economics, long-term vision for the company, gross margin rate, and growth opportunities for individuals at those companies.”

He said that Human Capital believed that, if a macro event occurred in the marketplace (and we’re right in the middle of one of the least predictable and most impactful macroeconomic events ever), some of those “stable” earlier-stage businesses wouldn’t be hit in the same way as public companies that have to worry about short-term profitability.

“The issue is that you have to know a lot about those businesses in order to be able to discern that, and that’s our job,” said Ali. “And what we’ve seen is that a number of the companies in that position are actually ramping up recruiting right now.”

There is no mandatory link between Human Capital’s venture capital fund and their recruiting/agency entity, though the fund does like to invest in engineers who have gone through the program and move on to start their own businesses. Those types of investments include Brex, Bolt and Qualia, among others. Human Capital also invests in companies for whom they’ve recruited, such as Livongo, Snowflake, Clumio, Wildlife and Trackonomy. Human Capital has a preference for leading rounds only for companies that are started by its engineer members.

The model isn’t unlike SignalFire or Glocap, founded by Adam Zoia (investor in Human Capital). The idea is that VC funds are great for capital injections, but with the cut-throat recruiting atmosphere and a finite number of engineers, that money can be relatively useless if it can’t be used to bring on the best talent. So firms like SignalFire (in the tech world) and Glocap (in the business/finance world) put recruitment front and center in their value proposition. (Glocap doesn’t invest, but is the premier recruitment platform in the financial sector.)

Human Capital is also starting to look at potential acquisitions that can beef up its agency business, recently acqui-hiring Khonvo Corporation, a recruitment agency founded by Archit Bhise and Andrew Rising.

Ovitz explained to TechCrunch that his ultra-successful career as an agent stemmed from his ability to make decisions about people and projects quickly. He sees the same type of intuition in Ali and Akis at a much younger age and with less experience than he had.

“It’s a checklist in your head,” said Ovitz. “It’s a combination of when your brain meets your stomach, your intellect meets your gut that lets you know you’ve hit a winner. The thing that’s allowed Ali and Akis to build a company that’s worth the hundreds of millions in such a short period of time is that they had that when I met them without having an enormous amount of experience.”

He added that access to the internet, which he did not have during his agency days, is an amazing learning tool and an “epic crutch” that, when paired with good instincts, can accelerate the learning curve on building a business.

(It’s worth noting that this isn’t Ovitz’s first foray into Silicon Valley. The entertainment powerhouse was one of the earliest advisors to Marc Andreessen and Ben Horowitz during the formation of the legendary VC firm a16z, helping them model the firm after CAA itself. Ovitz has been quietly investing in and advising tech startups for the past 15 years.)

Granulate announces $12M Series A to optimize infrastructure performance

As companies increasingly look to find ways to cut costs, Granulate, an early-stage Israeli startup, has come up with a clever way to optimize infrastructure usage. Today it was rewarded with a tidy $12 million Series A investment.

Insight Partners led the round with participation from TLV Partners and Hetz Ventures. Lonne Jaffe, managing director at Insight Partners, will be joining the Granulate board under the terms of the agreement. Today’s investment brings the total raised to $15.6 million, according to the company.

The startup claims it can cut infrastructure costs, whether on-prem or in the cloud, by between 20% and 80%. This is not insignificant if they can pull this off, especially in the economic maelstrom in which we find ourselves.

Asaf Ezra, co-founder and CEO at Granulate, says the company achieved the efficiency by closely studying how Linux virtual machines work. Over six months of experimentation, they simply moved the bottleneck around until they learned how to take advantage of the way the Linux kernel operates to gain massive efficiencies.

It turns out that Linux has been optimized for resource fairness, but Granulate’s founders wanted to flip this idea on its head and look for repetitiveness, concentrating on one function instead of fair allocation across many functions, some of which might not really need access at any given moment.

“When it comes to production systems, you have a lot of repetitiveness in the machine, and you basically want it to do one thing really well,” he said.

He points out that it doesn’t even have to be a VM. It could also be a container or a pod in Kubernetes. The important thing to remember is that you no longer care about the interactivity and fairness inherent in Linux; instead, you want the machine to be optimized for certain things.

“You let us know what your utility function for that production system is, then our agents basically optimize all the decision making for that utility function. That means that you don’t even have to do any code changes to gain the benefit,” Ezra explained.

What’s more, the solution uses machine learning to help understand how the different utility functions work to provide greater optimization to improve performance even more over time.

Insight’s Jaffe certainly recognized the potential of such a solution, especially right now.

“The need to have high-performance digital experiences and lower infrastructure costs has never been more important, and Granulate has a highly differentiated offering powered by machine learning that’s not dependent on configuration management or cloud resource purchasing solutions,” Jaffe said in a statement.

Ezra understands that a product like his could be particularly helpful at the moment. “We’re in a unique position. Our offering right now helps organizations survive the downturn by saving costs without firing people,” he said.

The company was founded in 2018 and currently has 20 employees. They plan to double that by the end of 2020.

Google Meet launches improved Zoom-like tiled layout, low-light mode and more

Google Meet, like all video chat products, is seeing rapid growth in user numbers right now, so it’s no surprise that Google is trying to capitalize on this and is quickly iterating on its product. Today, it is officially launching a set of new features that include a more Zoom-like tiled layout, a low-light mode for when you have to make calls at night and the ability to present a single Chrome tab instead of a specific window or your entire screen. Soon, Meet will also get built-in noise cancellation so nobody will hear your dog bark in the background.

If all of this sounds a bit familiar, it’s probably because G Suite exec Javier Soltero already talked to Reuters about these features last week. Google PR is usually pretty straightforward, but in this case, it moved in mysterious ways. Today, though, these features are actually starting to roll out to users, a Google spokesperson told me, and today’s announcement does actually provide more details about each of these features.

For the most part, what’s being announced here is obvious. The tiled layout allows web users to see up to 16 participants at once. Previously, that number was limited to four. Google promises it will offer additional layouts for larger meetings and better presentation layouts, as well as support for more devices in the future.

Having this many people stare at me from my screen doesn’t seem necessary (and seems more likely to induce stress than anything else), but the ability to present a single Chrome tab is surely a welcome new feature for many. What’s probably just as important is that this means you can share higher-quality video content from these tabs than before.

If you often take meetings in the dark, low-light mode uses AI to brighten up your video. Unlike some of the other features, this one is coming to mobile first and will come to web users in the future.

Personally, I’m most excited about the new noise cancellation feature. Typically, noise cancellation works best for noises that repeat and are predictable. Think about the constant drone of an airplane or your neighbor’s old lawnmower. But Google says Meet can now go beyond this and also cancel out barking dogs and your noisy keystrokes. That has increasingly become table stakes, with even Discord offering similar capabilities and Nvidia RTX Voice now making this available in a slew of applications for users of its high-end graphics cards, but it’s nice to see this as a built-in feature for Meet now.

This feature will only roll out in the coming weeks and will initially be available to G Suite Enterprise and G Suite Enterprise for Education users on the web, with mobile support coming later.

Google Cloud’s fully managed Anthos is now generally available for AWS

A year ago, back in the days of in-person conferences, Google officially announced the launch of its Anthos multi-cloud application modernization platform at its Cloud Next conference. The promise of Anthos was always that it would allow enterprises to write their applications once, package them into containers and then manage their multi-cloud deployments across GCP, AWS, Azure and their on-prem data centers.

Until now, support for AWS and Azure was only available in preview, but today, the company is making support for AWS and on-premises generally available. Microsoft Azure support remains in preview, though.

“As an AWS customer now, or a GCP customer, or a multi-cloud customer, […] you can now run Anthos on those environments in a consistent way, so you don’t have to learn any proprietary APIs and be locked in,” Eyal Manor, the GM and VP of engineering in charge of Anthos, told me. “And for the first time, we enable the portability between different infrastructure environments as opposed to what has happened in the past where you were locked into a set of APIs.”

Manor stressed that Anthos was designed to be multi-cloud from day one. As for why AWS support is launching ahead of Azure, Manor said that there was simply more demand for it. “We surveyed the customers and they said, ‘hey, we want, in addition to GCP, we want AWS,’ ” he said. But support for Azure will come later this year and the company already has a number of preview customers for it. In addition, Anthos will also come to bare metal servers in the future.

Looking even further ahead, Manor also noted that better support for machine learning workloads is on the way. Many businesses, after all, want to be able to update and run their models right where their data resides, no matter what cloud that may be. There, too, the promise of Anthos is that developers can write the application once and then run it anywhere.

“I think a lot of the initial response and excitement was from the developer audiences,” Jennifer Lin, Google Cloud’s VP of product management, told me. “Eric Brewer had led a white paper that we did to say that a lot of the Anthos architecture sort of decouples the developer and the operator stakeholder concerns. There hadn’t been a multi-cloud shared software architecture where we could do that and still drive emerging and existing applications with a common shared software stack.”

She also noted that a lot of Google Cloud’s ecosystem partners endorsed the overall Anthos architecture early on because they, too, wanted to be able to write once and run anywhere — and so do their customers.

Plaid is one of the launch partners for these new capabilities. “Our customers rely on us to be always available and as a result we have very high reliability requirements,” said Naohiko Takemura, Plaid’s head of engineering. “We pursued a multi-cloud strategy to ensure redundancy for our critical KARTE service. Google Cloud’s Anthos works seamlessly across GCP and our other cloud providers preventing any business disruption. Thanks to Anthos, we prevent vendor lock-in, avoid managing cloud-specific infrastructure, and our developers are not constrained by cloud providers.”

With this release, Google Cloud is also bringing deeper support for virtual machines to Anthos, as well as improved policy and configuration management.

Over the next few months, the Anthos Service Mesh will also add support for applications that run in traditional virtual machines. As Lin told me, “a lot of this is about driving better agility and taking the complexity out of it so that we have abstractions that work across any environment, whether it’s legacy or new or on-prem or AWS or GCP.”

AWS launches Amazon AppFlow, its new SaaS integration service

AWS today launched Amazon AppFlow, a new integration service that makes it easier for developers to transfer data between AWS and SaaS applications like Google Analytics, Marketo, Salesforce, ServiceNow, Slack, Snowflake and Zendesk. As with similar services, including Microsoft Azure’s Power Automate, developers can trigger these flows based on specific events, at pre-set times or on-demand.

Unlike some of its competitors, though, AWS is positioning this service more as a data transfer service than a way to automate workflows, and, while the data flow can be bi-directional, AWS’s announcement focuses mostly on moving data from SaaS applications to other AWS services for further analysis. For this, AppFlow also includes a number of tools for transforming the data as it moves through the service.

“Developers spend huge amounts of time writing custom integrations so they can pass data between SaaS applications and AWS services so that it can be analysed; these can be expensive and can often take months to complete,” said AWS principal advocate Martin Beeby in today’s announcement. “If data requirements change, then costly and complicated modifications have to be made to the integrations. Companies that don’t have the luxury of engineering resources might find themselves manually importing and exporting data from applications, which is time-consuming, risks data leakage, and has the potential to introduce human error.”

Every flow (which AWS defines as a call to a source application to transfer data to a destination) costs $0.001 per run, though, in typical AWS fashion, there’s also a cost associated with data processing (starting at $0.02 per GB).
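To make the two-part pricing concrete, here is a rough back-of-the-envelope sketch. It uses only the two starting rates quoted above and assumes the data-processing rate is in US dollars per GB; the helper name and the example workload are hypothetical, and real bills may include charges not covered here.

```python
# Starting rates quoted in the article, in US dollars.
PRICE_PER_FLOW_RUN_USD = 0.001
PRICE_PER_GB_PROCESSED_USD = 0.02  # assumed to be dollars per GB


def estimate_appflow_cost(flow_runs: int, gb_processed: float) -> float:
    """Hypothetical helper: flow-run charges plus data-processing charges."""
    return (flow_runs * PRICE_PER_FLOW_RUN_USD
            + gb_processed * PRICE_PER_GB_PROCESSED_USD)


# Example workload (made up): one flow triggered hourly for 30 days,
# moving 50 GB in total over that period.
cost = estimate_appflow_cost(flow_runs=24 * 30, gb_processed=50)
print(f"Estimated cost: ${cost:.2f}")  # prints "Estimated cost: $1.72"
```

At these rates, the per-run fee stays small unless flows fire very frequently; for data-heavy transfers the per-GB charge quickly becomes the larger share.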

“Our customers tell us that they love having the ability to store, process, and analyze their data in AWS. They also use a variety of third-party SaaS applications, and they tell us that it can be difficult to manage the flow of data between AWS and these applications,” said Kurt Kufeld, vice president, AWS. “Amazon AppFlow provides an intuitive and easy way for customers to combine data from AWS and SaaS applications without moving it across the public internet. With Amazon AppFlow, our customers bring together and manage petabytes, even exabytes, of data spread across all of their applications — all without having to develop custom connectors or manage underlying API and network connectivity.”

At this point, the number of supported services remains comparatively low, with only 14 possible sources and four destinations (Amazon Redshift and S3, as well as Salesforce and Snowflake). Sometimes, depending on the source you select, the only possible destination is Amazon’s S3 storage service.

Over time, the number of integrations will surely increase, but for now, it feels like there’s still quite a bit more work to do for the AppFlow team to expand the list of supported services.

AWS has long left this market to competitors, even though it has tools like AWS Step Functions for building serverless workflows across AWS services and EventBridge for connecting applications. Interestingly, EventBridge currently supports a far wider range of third-party sources, but as the name implies, its focus is more on triggering events in AWS than moving data between applications.

ForgeRock nabs $93.5M for its ID management platform, gears up next for an IPO

For better or worse, digital identity management services, which identify and authenticate users on networks so they can access services, have become a ubiquitous part of interacting on the internet, all the more so in recent weeks as we have been asked to carry out more and more of our lives online.

Used correctly, they help ensure that it’s really you logging into your online banking service; used badly, you feel like you can’t innocently watch something silly on YouTube without being watched yourself. Altogether, they are a huge business: worth $16 billion today according to Gartner but growing at upwards of 30% and potentially as big as $30.5 billion by 2024, according to the latest forecasts.

Now, a company called ForgeRock, which has built a platform that is used to help make sure that those accessing services really are who they say they are, and help organizations account for how their services are getting used, is announcing a big round of funding to continue expanding its business amid a huge boost in demand.

The company is today announcing that it has raised $93.5 million in funding, a Series E it will use to continue expanding its product and take it to its next step as a business, specifically investing in R&D, cloud services and its ForgeRock Identity Cloud, and general global business development.

Riverwood Capital is leading the round, with participation from Accenture Ventures and previous investors Accel, Meritech Capital, Foundation Capital and KKR Growth.

Fran Rosch, the startup’s CEO, said in an interview that this will likely be its final round of funding ahead of an IPO, although given the current state of affairs with a lot of M&A, there is no timing set for when that might happen. (Notably, the company had said its last round of funding, $88 million in 2017, would be its final ahead of an IPO, although that was under a different CEO.)

This Series E brings the total raised by the company to $230 million. Rosch confirmed it was raised as a material upround, although he declined to give a valuation. For some context, the company’s last post-money valuation was $646.50 million per PitchBook, and so this round values the company at more than $730 million.
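As a rough illustration of how such a floor can be estimated, the sketch below adds the new money to the last reported post-money valuation. It assumes that an up round implies a pre-money valuation at least as high as the previous post-money figure; the function name is hypothetical, and the result is a lower bound under that assumption, not the undisclosed valuation.

```python
# Figures from the article, in millions of US dollars.
LAST_POST_MONEY_MUSD = 646.5  # previous post-money valuation, per PitchBook
SERIES_E_MUSD = 93.5          # new money raised in this round


def post_money(pre_money_musd: float, investment_musd: float) -> float:
    """Post-money valuation is pre-money valuation plus the new investment."""
    return pre_money_musd + investment_musd


# Assumption: in an up round, pre-money is at least the last post-money,
# so this is a floor rather than the actual (undisclosed) valuation.
floor_musd = post_money(LAST_POST_MONEY_MUSD, SERIES_E_MUSD)
print(f"Implied floor: ${floor_musd:.1f}M")  # prints "Implied floor: $740.0M"
```

Under that assumption the floor lands at about $740 million, which is consistent with the "more than $730 million" figure.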

ForgeRock has annual recurring revenues of more than $100 million, with annual revenues also at over $100 million, Rosch said. It operates in an industry heavy with competition, with others vying for pole position in the various aspects of identity management including Okta, LastPass, Duo Security and Ping Identity.

But within that list it has amassed some impressive traction. In total it has 1,100 enterprise customers, who in turn collectively manage 2 billion identities through ForgeRock’s platform, with considerably more devices also authenticated and managed on top of that.

Customers include the likes of the BBC — which uses ForgeRock to authenticate and log not just 45 million users but also the devices they use to access its iPlayer on-demand video streaming service — Comcast, a number of major banks, the European Union and several other government organizations. ForgeRock was originally founded in Norway about a decade ago, and while it now has its headquarters in San Francisco, it still has about half its employees and half its customers on the other side of the Atlantic.

Currently ForgeRock provides services to businesses related to identity management including password and username creation, identity governance, directory services, privacy and consent gates, which they in turn provide both to their human customers as well as to devices accessing their services, but we’re in a period of change right now when it comes to identity management. It stays away from direct-to-consumer password management services and Rosch said there are no plans to move into that area.

These days, we’ve become more aware of privacy and data protection. Sometimes, it’s been because of the wrong reasons, such as giant security breaches that have leaked some aspect of our personal information into a giant database, or because of a news story that has uncovered how our information has unwittingly been used in ‘legit’ commercial schemes, or other ways we never imagined it would.

Those developments, combined with advances in technology, are very likely to lead us to a place over time where identity management will become significantly more shielded from misuse. These could include more ubiquitous use of federated identities, “lockers” that store our authentication credentials that can be used to log into services but remain separate from their control, and potentially even applications of blockchain technology.

All of this means that while a company like ForgeRock will continue to provide its current services, it’s also investing big in what it believes will be the next steps that we’ll take as an industry, and society, when it comes to digital identity management — something that has had a boost of late.

“There are a lot of interesting things going on, and we are working closely behind the scenes to flesh them out,” Rosch said. “For example, we’re looking at how best to break up data links where we control identities to get access for a temporary period of time but then pull back. It’s a powerful trend that is still about four to five years out. But we are preparing for this, a time when our platform can consume decentralised identity, on par with logins from Google or Facebook today. That is an interesting area.”

He notes that the current market, where there has been an overall surge for all online services as people are staying home to slow the spread of the coronavirus pandemic, has seen big boosts in specific verticals.

Its largest financial services and banking customers have seen traffic up by 50%, and digital streaming has been up by 300% — with customers like the BBC seeing spikes in usage at 5pm every day (at the time of the government COVID-19 briefing) that are as high as its most popular primetime shows or sporting events — and use of government services has also been surging, in part because many services that hadn’t been online are now developing online presences or seeing much more traffic from digital channels than before. Unsurprisingly, its customers in hotel and travel, as well as retail, have seen drops, he added.

“ForgeRock’s comprehensive platform is very well-positioned to capitalize on the enormous opportunity in the Identity & Access Management market,” said Jeff Parks, co-founder and managing partner of Riverwood Capital, in a statement. “ForgeRock is the leader in solving a wide range of workforce and consumer identity use cases for the Global 2000 and is trusted by some of the largest companies to manage millions of user identities. We have seen the growth acceleration and are thrilled to partner with this leadership team.” Parks is joining the board with this round.

Confluent lands another big round with $250M Series E on $4.5B valuation

The pandemic may feel all-encompassing at the moment, but Confluent announced a $250 million Series E today, showing that major investment continues in spite of the dire economic situation. The company is now valued at $4.5 billion.

Today’s round follows last year’s $125 million Series D. At that point the company was valued at a mere $2.5 billion. Investors obviously see a lot of potential here.

Coatue Management led the round, with help from Altimeter Capital and Franklin Templeton. Existing investors Index Ventures and Sequoia Capital also participated. Today’s investment brings the total raised to $456 million.

The company is based on Apache Kafka, the open-source streaming data project that emerged from LinkedIn in 2011. Confluent launched in 2014 and has gained steam, funding and gaudy valuations along the way.

CEO and co-founder Jay Kreps reports that growth continued last year when sales grew 100% over the previous year. A big part of that is the cloud product the company launched in 2017. It added a free tier last September, which feels pretty prescient right about now.

But the company isn’t making money giving stuff away, so much as attracting users, who can become customers at some point as they make their way through the sales funnel. The beauty of the cloud product is that you can buy by the sip.

The company has big plans for the product this year. Although Kreps was loath to go into detail, he says that there will be a series of changes coming up this year that will add significantly to the product’s capabilities.

“As part of this we’re going to have a major new set of capabilities for our cloud service, and for open-source Kafka, and for our product that we’re going to announce every month for the rest of the year,” Kreps told TechCrunch. These will start rolling out the first week in May.

While he wouldn’t get specific, he says that it relates to the changing nature of cloud infrastructure deployment. “This whole infrastructure area is really evolving as it moves to the cloud. And so it has to become much, much more elastic and scalable as it really changes how it works. And we’re going to have announcements around what we think are the core capabilities of event streaming in the cloud,” he said.

While a round this big with a valuation this high and an institutional investor like Franklin Templeton involved typically means an IPO could be the next step, Kreps was not ready to talk about that, except to say the company does plan to begin behaving in the cadence of a public company with a set of quarterly earnings, just not for public consumption yet.

The company was founded in 2014. It has 1,000 employees and has plans to continue to hire and to expand the product. Kreps sees plenty of opportunity here in spite of the current economics.

“I don’t think you want to just turtle up and hang on to your existing customers and not expand if you’re in a market that’s really growing. What really got this round of investors excited is the fact that we’re onto something that has a huge market, and we want to continue to advance, even in these really weird uncertain times,” he said.

Env0 announces $3.3M seed to bring more control to Infrastructure as Code

Env0, a startup that wants to help companies bring some order to the delivery of Infrastructure as Code, announced a $3.3 million seed investment today and the release of the beta of the company’s first product.

Boldstart Ventures and Grove Ventures co-led the round with participation from several angel investors including Guy Podjarny of Snyk.

Company co-founder and CEO Ohad Maislish says the ability of developers to deliver code quickly is a blessing and a curse, and his company wants to give IT some control over how and when code gets committed.

“The challenge companies have is how to balance between self-service and oversight of cloud resources in a cloud native kind of way, and to balance this with visibility, predictability, and most importantly, governance around cloud security and costs,” Maislish said.

The product lets companies define when it’s OK for developers to deliver code and how much they can spend instead of letting them deliver anything, at any time, at any cost. You do this by giving overall control of the process to an administrator, who can then define templates and projects. The templates define which repositories and products you can use for a given cloud vendor and the projects correlate to the users allowed to access those templates.
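As a rough illustration of the template-and-project model described above, here is a minimal Python sketch. Every name in it (`Template`, `Project`, `can_deploy`, the budget fields, the example repository URL) is hypothetical and is not Env0's actual API or data model. It only shows how an administrator-defined policy could gate a deployment by user, template and cost.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Template:
    """Hypothetical admin-defined template: a repository allowed for one cloud vendor."""
    name: str
    cloud_vendor: str
    repository: str


@dataclass
class Project:
    """Hypothetical project: ties users to templates and a spending cap."""
    name: str
    users: set = field(default_factory=set)
    templates: dict = field(default_factory=dict)  # template name -> Template
    budget_usd: float = 0.0
    spent_usd: float = 0.0

    def can_deploy(self, user: str, template_name: str, estimated_cost_usd: float) -> bool:
        """Allow a deployment only if the user, the template and the cost all pass policy."""
        if user not in self.users:
            return False
        if template_name not in self.templates:
            return False
        return self.spent_usd + estimated_cost_usd <= self.budget_usd


# Assumed example data, purely for illustration.
web = Template(name="web-stack", cloud_vendor="aws",
               repository="git@example.com:acme/web-infra.git")
project = Project(name="staging", users={"dana"},
                  templates={web.name: web}, budget_usd=100.0)

allowed = project.can_deploy("dana", "web-stack", 40.0)       # member, known template, within budget
wrong_user = project.can_deploy("lee", "web-stack", 40.0)     # not a member of the project
over_budget = project.can_deploy("dana", "web-stack", 140.0)  # would exceed the spending cap
print(allowed, wrong_user, over_budget)
```

In this sketch the administrator owns the `Template` and `Project` definitions, and developers only ever call something like `can_deploy`, which is one way to express the self-service-with-oversight balance the company describes.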

Ed Sim, founder and managing partner at Boldstart, says the startup has been able to find a good balance between governance and the need for speed that today’s developers require in a continuous delivery environment. “Env0 is the first SaaS solution that meets all of those needs by offering self-service cloud environments with centralized governance,” Sim said in a statement.

It’s not easy launching an early-stage company in the middle of the current economic situation, but Maislish believes his company is in a decent position as it provides a way to control self-service development, something that is even more important when your developers are working from home outside of the purview of IT and security.

The company launched 18 months ago and has been in private beta for some time. Today marks the launch of the public beta. It currently has 10 employees.

AWS and Facebook launch an open-source model server for PyTorch

AWS and Facebook today announced two new open-source projects around PyTorch, the popular open-source machine learning framework. The first of these is TorchServe, a model-serving framework for PyTorch that will make it easier for developers to put their models into production. The other is TorchElastic, a library that makes it easier for developers to build fault-tolerant training jobs on Kubernetes clusters, including AWS’s EC2 spot instances and Elastic Kubernetes Service.

In many ways, the two companies are taking what they have learned from running their own machine learning systems at scale and are putting this into the project. For AWS, that’s mostly SageMaker, the company’s machine learning platform, but as Bratin Saha, AWS VP and GM for Machine Learning Services, told me, the work on PyTorch was mostly motivated by requests from the community. And while there are obviously other model servers like TensorFlow Serving and the Multi Model Server available today, Saha argues that it would be hard to optimize those for PyTorch.

“If we tried to take some other model server, we would not be able to quote optimize it as much, as well as create it within the nuances of how PyTorch developers like to see this,” he said. AWS has lots of experience in running its own model servers for SageMaker that can handle multiple frameworks, but the community was asking for a model server that was tailored toward how they work. That also meant adapting the server’s API to what PyTorch developers expect from their framework of choice, for example.

As Saha told me, the server that AWS and Facebook are now launching as open source is similar to what AWS is using internally. “It’s quite close,” he said. “We actually started with what we had internally for one of our model servers and then put it out to the community, worked closely with Facebook, to iterate and get feedback — and then modified it so it’s quite close.”

Bill Jia, Facebook’s VP of AI Infrastructure, also told me he’s very happy about how his team and the community have pushed PyTorch forward in recent years. “If you look at the entire industry community — a large number of researchers and enterprise users are using AWS,” he said. “And then we figured out if we can collaborate with AWS and push PyTorch together, then Facebook and AWS can get a lot of benefits, but more so, all the users can get a lot of benefits from PyTorch. That’s our reason for why we wanted to collaborate with AWS.”

As for TorchElastic, the focus here is on allowing developers to create training systems that can work on large distributed Kubernetes clusters where you might want to use cheaper spot instances. Those instances are preemptible, though, so your system has to be able to handle losing them, whereas machine learning training frameworks have traditionally expected the number of instances to stay the same throughout the process. That, too, is something AWS originally built for SageMaker. There, it’s fully managed by AWS, so developers never have to think about it. For developers who want more control over their dynamic training systems or to stay very close to the metal, TorchElastic now allows them to recreate this experience on their own Kubernetes clusters.
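To make the preemption problem concrete, here is a small Python simulation of the general checkpoint-and-resume idea behind fault-tolerant training. It is a sketch under stated assumptions, not TorchElastic's implementation: it makes no TorchElastic or PyTorch calls, and the function name `train_elastic` and its parameters are hypothetical. It does no real training and only counts steps to show how a job can still finish when its instance is reclaimed at random.

```python
import random


def train_elastic(total_steps, checkpoint_every, preempt_prob, seed=0):
    """Simulate a training job that survives preemption by resuming from a checkpoint.

    Hypothetical simulation only: no real training and no TorchElastic calls.
    Returns the final step reached and the number of restarts.
    """
    rng = random.Random(seed)
    checkpoint = {"step": 0}  # last durable state (would live on shared storage)
    restarts = 0
    step = 0
    while step < total_steps:
        # A spot instance may be reclaimed before any step completes.
        if rng.random() < preempt_prob:
            restarts += 1
            step = checkpoint["step"]  # work since the last checkpoint is lost
            continue
        step += 1  # one unit of training work
        if step % checkpoint_every == 0:
            checkpoint["step"] = step  # persist progress
    return step, restarts


final_step, restarts = train_elastic(total_steps=100, checkpoint_every=10, preempt_prob=0.05)
print(final_step, restarts)
```

The trade-off this exposes is the checkpoint interval: checkpointing more often loses less work per preemption but spends more time saving state. A real elastic system also has to handle the number of workers changing between restarts, which this sketch leaves out.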

AWS has a bit of a reputation when it comes to open source and its engagement with the open-source community. In this case, though, it’s nice to see AWS lead the way to bring some of its own work on building model servers, for example, to the PyTorch community. In the machine learning ecosystem, that’s very much expected, and Saha stressed that AWS has long engaged with the community as one of the main contributors to MXNet and through its contributions to projects like Jupyter, TensorFlow and libraries like NumPy.