Facebook is releasing the hardware design for a server it uses to train artificial intelligence software.
Facebook uses the server, code-named Big Sur, to run its machine learning programs, a type of AI software that “learns” and gets better at tasks over time. The company is contributing Big Sur to the Open Compute Project.
One use for machine learning is image recognition, but it’s being applied to all kinds of data sets to identify things like email spam and credit card fraud.
Facebook, Google and Microsoft are all pushing hard at AI, which helps them build smarter online services.
Big Sur relies on GPUs, which are often more efficient than CPUs for machine learning tasks. It can have as many as eight high-performance GPUs that each consume up to 300 watts, and can be configured in a variety of ways via PCIe.
Facebook said the GPU-based system is twice as fast as its previous generation of hardware. “And distributing training across eight GPUs allows us to scale the size and speed of our networks by another factor of two,” it said in a blog post Thursday.
Notably, Big Sur doesn’t require special cooling or other “unique infrastructure,” Facebook said. High-performance computers generate a lot of heat, and keeping them cool can be costly. Some are even immersed in exotic liquids to stop them overheating.
Big Sur doesn’t need any of that, according to Facebook. The company hasn’t released the hardware specs yet, but images show a large airflow unit inside the server that presumably contains fans to blow cool air across the components. Facebook says it can use the servers in its air-cooled data centers, which avoid industrial cooling systems to keep costs down.
Like a lot of other Open Compute hardware, Big Sur is designed to be as simple as possible. OCP members are fond of talking about the “gratuitous differentiation” that server vendors put in their products, which can drive up costs and make it harder to manage equipment from different vendors.
“We’ve removed the components that don’t get used very much, and components that fail relatively frequently, such as hard drives and DIMMs, can now be removed and replaced in a few seconds,” Facebook said. All the handles and levers that technicians are supposed to touch are colored green, so the machines can be serviced quickly, and even the motherboard can be removed within a minute. “In fact, Big Sur is almost entirely tool-less: the CPU heat sinks are the only things you need a screwdriver for,” Facebook said.
Google is also rolling out machine learning across more of its services. “Machine learning is a core, transformative way by which we’re rethinking everything we’re doing,” Google CEO Sundar Pichai said