Submitted by Mae Rice on Mon, 06/24/2019 - 22:15

It's unclear when plain old “data” became “big data,” but the latter term probably originated in 1990s Silicon Valley pitch meetings and lunch rooms. What's easier to pinpoint is how data has exploded in the 21st century.

The total amount of data recorded until 2003 was five exabytes, or five quintillion bytes. (A quintillion is a million, cubed.) In 2011 alone, recorded data weighed in at 1.8 zettabytes, or 360 times as much. By 2020, according to one estimate, the average person will produce 1.5 GB of data per day. Multiply that by 365 days and then again by a good chunk of the world's 7.5 billion-person population, and the volume is almost unfathomably immense.
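The arithmetic behind these figures can be sketched in a few lines of Python. This is only a back-of-envelope check: it assumes decimal units (1 GB = 10^9 bytes), and the 50 percent share of the population is a hypothetical stand-in for "a good chunk," not a figure from any estimate.

```python
# Decimal (SI) byte units. This is an assumption; the article does not
# say which convention its figures use.
GB = 10**9
EB = 10**18   # exabyte: one quintillion bytes, i.e. (10**6) ** 3
ZB = 10**21   # zettabyte: 1,000 exabytes

data_until_2003 = 5 * EB
data_in_2011 = 18 * ZB // 10          # 1.8 zettabytes, kept as an integer

# How many times larger 2011's data was than everything recorded until 2003.
ratio = data_in_2011 / data_until_2003

per_person_per_day = 1.5 * GB
world_population = 7.5e9
share_producing_data = 0.5            # hypothetical value for "a good chunk"

yearly_bytes = per_person_per_day * 365 * world_population * share_producing_data
yearly_zettabytes = yearly_bytes / ZB

print(f"2011 vs. pre-2003 data: {ratio:.0f}x")
print(f"Estimated yearly volume: {yearly_zettabytes:.2f} ZB")
```

Under these assumptions, a single year of data would come to roughly two zettabytes, more than all the data recorded in 2011.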

Big Data Analytics Platforms To Know

  • Microsoft Azure
  • Cloudera
  • Sisense
  • Collibra
  • Tableau
  • MapR
  • Qualtrics
  • Oracle
  • MongoDB
  • Datameer

Because the persistent gush of data from numerous sources is only growing more intense, lots of sophisticated and highly scalable big data analytics platforms (many of them cloud-based) have popped up to parse the ever-expanding mass of information.

We’ve rounded up 31 big data platforms that make petabytes of data feel manageable.

 

Sumo Logic

Location: Redwood City, Calif.

What it does: The cloud-native Sumo Logic platform offers apps, including Airbnb and Pokemon GO, three types of support: it troubleshoots, tracks business analytics and catches security breaches, drawing on machine learning for efficiency. It’s also flexible enough to absorb sudden influxes of data.

 

Microsoft Azure

Location: Seattle

What it does: Users can analyze data stored on Microsoft’s cloud platform, Azure, with a broad spectrum of open-source Apache technologies, including Hadoop and Spark. Azure also features a native analytics tool, HDInsight, that streamlines data cluster analysis and integrates seamlessly with Azure's other data tools.

 

Cloudera 

Location: Palo Alto, Calif.

What it does: Rooted in Apache’s Hadoop, Cloudera can handle massive amounts of data. Clients routinely store more than 50 petabytes in Cloudera’s Data Warehouse, which can manage data including machine logs, text, and more. Meanwhile, Cloudera’s DataFlow—previously Hortonworks’ DataFlow—analyzes and prioritizes data in real time.

 

SHYFT Data and Analytics Platform

Location: Boston

What it does: SHYFT designed its data analytics platform for the life science industries. With patient privacy in mind, the HIPAA- and PII-compliant tool automatically finds, imports and runs analytics on hundreds of data streams, blending them into a cohesive whole. The platform’s quick and slick data visualizations help users uncover unexpected correlations between datasets.

 

Google Cloud

Location: Mountain View, Calif.

What it does: Google Cloud offers lots of big data management tools, each with its own specialty. BigQuery warehouses petabytes of data in an easily queried format. Cloud Dataflow analyzes ongoing data streams and batches of historical data side by side. With Google Data Studio, clients can turn varied data into custom graphics.

 

Sisense 

Location: NYC

What it does: Sisense’s data analytics platform processes data swiftly thanks to its signature In-Chip Technology. The interface also lets clients build, use and embed custom dashboards and analytics apps. After a recent merger, Sisense is poised to combine its platform with Periscope Data’s. The merger will allow users to simultaneously comb data repositories with SQL, Python and R.

 

Collibra

Location: NYC

What it does: Designed to accommodate the needs of banking, healthcare and other data-heavy fields, Collibra lets employees companywide find quality, relevant data. The versatile platform features semantic search, which can find more relevant results by unraveling contextual meanings and pronoun referents in search phrases.

 

Talend

Location: San Francisco

What it does: Talend’s trio of big data integration platforms includes a free basic platform and two paid subscription platforms, all rooted in open-source tools like Apache Spark. The paid platforms, though (one designed for existing data, the other for real-time data streams), come with more power and tech support. Both can clean and parse data, delete duplicate data and detect fraud automatically, among other functions.

 

Tableau

Location: Austin, Texas

What it does: The Tableau platform, available on-premises or in the cloud, allows users to find correlations, trends and unexpected interdependencies between data sets. The Data Management add-on further enhances the platform, allowing for more granular data cataloging and the tracking of data lineage.

 

MapR 

Location: Santa Clara, Calif.

What it does: MapR’s platform, which the company calls "dataware," has attracted customers like American Express and Samsung with its massive capacity (exabytes!) and robust security measures. But it's less a platform than a meta-platform: a dashboard for managing big data spread across various platforms, clouds, servers and edge-computing devices. Its interface offers users a 10,000-foot view of the totality of their data while letting them manage various data types in one place.

 

Qualtrics Experience Management

Location: Seattle 

What it does: Qualtrics’ platform lets companies assess the four key experiences that define their brand: customer experience; employee experience; product experience; and the brand experience, defined by marketing and brand awareness. Its analytics tools turn data on employee satisfaction, marketing campaign impact and more into actionable predictions rooted in machine learning and AI.

1010Data’s 1010Edge

Location: NYC

What it does: This scalable cloud-based big data platform compiles and unifies data for giant enterprises, including Bank of America and Coca-Cola. Along the way, it can pull in relevant third-party data — like conversion rates and buyer behavior intel — from 1010Reveal. The searchable platform efficiently processes multiple complex queries at once.

 

Teradata

Location: San Diego, Calif.

What it does: Teradata’s Vantage analytics software works with various public cloud services, but users can also combine it with Teradata Cloud storage. This all-Teradata experience maximizes synergy between cloud hardware and Vantage’s machine learning and NewSQL engine capabilities. Teradata Cloud users also enjoy special perks—new Vantage features, for instance, are available on Teradata’s cloud before they're available to users of other cloud services.

 

Oracle 

Location: Westminster, Colo.

What it does: Oracle Cloud’s big data platform can automatically migrate diverse data formats to cloud servers, purportedly with no downtime. The platform can also operate on-premises and in hybrid settings, enriching and transforming data whether it’s streaming in real time or stored in a centralized repository, also known as a "data lake." The platform comes in three formats, including basic and governance editions.

 

Domo

Location: American Fork, Utah

What it does: Domo’s big data platform draws on clients’ full data portfolios to offer industry-specific findings and AI-based predictions. Even when relevant data sprawls across multiple cloud servers and hard drives, Domo clients can gather it all in one place with Magic ETL, a drag-and-drop tool that streamlines the integration process.

 

MongoDB

Location: NYC

What it does: MongoDB doesn’t force data into spreadsheet-style rows and columns. Instead, its cloud-based platforms store data as flexible JSON documents: digital objects that can be arranged in a variety of ways, even nested inside each other. Designed for app developers, the platforms offer of-the-moment search functionality. For example, users can search their data for geotags and graphs as well as text phrases.
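To make the idea of flexible, nestable JSON documents concrete, here is a minimal sketch that uses only Python's standard library. It does not touch MongoDB or any of its drivers; the field names, the sample records and the `find_by_tag` helper are all hypothetical and exist purely for illustration.

```python
import json

# A hypothetical "user" document. Field names are illustrative only.
user = {
    "name": "Ada",
    "tags": ["beta", "newsletter"],
    "last_checkin": {                      # a document nested inside another
        "venue": "Coffee shop",
        # A geotag written as a GeoJSON point: [longitude, latitude].
        "location": {"type": "Point", "coordinates": [-73.99, 40.73]},
    },
}

# A second document in the same collection can have a different shape;
# nothing forces it to carry the same fields as the first.
other_user = {"name": "Grace", "tags": []}

collection = [user, other_user]


def find_by_tag(docs, tag):
    """Return the names of documents whose "tags" list contains `tag`."""
    return [doc["name"] for doc in docs if tag in doc.get("tags", [])]


# Documents serialize to JSON text and back without losing their nesting.
serialized = json.dumps(user)
round_tripped = json.loads(serialized)

print(find_by_tag(collection, "beta"))
print(round_tripped["last_checkin"]["location"]["coordinates"])
```

The point of the sketch is the shape of the data: one record nests a check-in and a geotag, the other omits them entirely, and both can sit in the same collection.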

 

Civis Platform

Location: Chicago

What it does: Civis Analytics’ cloud-based platform offers end-to-end data services, from data ingestion to modeling and reports. Designed with data scientists in mind, the platform integrates with GitHub to ease user collaboration and is purportedly ultra-secure—both HIPAA-compliant and SOC 2 Type II-certified. 

 

Alteryx

Location: Broomfield, Colo.

What it does: Alteryx’s designers built the company’s eponymous platform with simplicity and interdepartmental collaboration in mind. Its four interlocking tools allow users to create repeatable data workflows, stripping busywork from the data prep and analysis process, and deploy R and Python code within the platform for quicker predictive analytics.

 

Zeta Interactive’s Marketing Platform 

Location: NYC

What it does: Designed for marketers, this platform from Zeta Interactive pulls data from three different clouds onto one dashboard. (One cloud is devoted to marketing, another to customer experience and a third to in-depth customer data culled from millions of user profiles with permission.) The platform’s AI features sift through the diverse data, helping marketers target key demographics and attract new customers. 

 

Hewlett Packard Enterprise’s Vertica 

Location: Palo Alto, Calif.

What it does: This software-only SQL data warehouse is storage system-agnostic. That means it can analyze data from cloud services, on-premises servers and any other data storage space. Vertica works quickly thanks to columnar storage, which facilitates the scanning of only relevant data. Its latest version offers predictive analytics rooted in machine learning for industries that include finance and marketing.
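Why columnar storage lets a query scan only relevant data can be shown with a toy example. The sketch below is a plain-Python illustration of the general idea, not a description of how Vertica is implemented; the sample records and the `to_columns` helper are hypothetical.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "region": "east", "revenue": 120.0},
    {"id": 2, "region": "west", "revenue": 80.0},
    {"id": 3, "region": "east", "revenue": 50.0},
]


def to_columns(records):
    """Pivot a list of records into one list of values per field."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every full record is visited, even though the query
# needs only one field from each.
total_from_rows = sum(record["revenue"] for record in rows)

# Column layout: only the "revenue" list is scanned; the "id" and
# "region" values are never touched.
total_from_columns = sum(columns["revenue"])

print(total_from_rows, total_from_columns)
```

Both layouts give the same total. The difference is how much data has to be read to get it, which matters once a table has hundreds of columns and billions of rows.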

 

Arm Treasure Data

Location: Mountain View, Calif.

What it does: Treasure Data’s customer data platform sorts morasses of web, mobile and IoT data into rich, individualized customer profiles so marketers can communicate with their desired demographics in a more tailored and personalized way. 

 

Amazon Web Services

Location: Seattle

What it does: Better known as AWS, Amazon’s cloud-based platform comes with 11 analytics tools designed for everything from data prep and warehousing to SQL queries and data lake design. All the resources scale with your data as it grows in a secure cloud-based environment. Features include customizable encryption and the option of a virtual private cloud.

 

Actian Avalanche

Location: San Francisco

What it does: Actian’s cloud-native data warehouse, which debuted in March 2019, was built for near-instantaneous results, even if users run multiple queries at once. Backed by support from Microsoft’s and Amazon’s public clouds, it can analyze data in public and private clouds. For easy app use, the platform comes with ready-made connections to Salesforce, Workday and others.

 

Pivotal Greenplum 

Location: San Francisco

What it does: Born out of the open-source Greenplum Database project, this platform uses PostgreSQL to conquer varied data analysis and operations projects, from quests for business intelligence to deep learning. Pivotal Greenplum can parse data housed in clouds and servers, as well as container orchestration systems. Additionally, it comes with a built-in toolkit of extensions for location-based analysis, document extraction and multi-node analysis.

 

Hitachi Vantara’s Pentaho

Location: Orlando, Fla.

What it does: This platform streamlines the data ingestion process by forgoing hand coding and offering time-saving functions like drag-and-drop integration, pre-made data transformation templates and metadata injection. Once users add data, the platform can mine business intelligence from any data format thanks to its data-agnostic design.

 

Exasol

Location: Nuremberg, Germany

What it does: This intelligent, in-memory analytics database was designed for speed, especially on clustered systems. It can analyze all types of data (sensor, online transaction, location and more) via massively parallel processing. The cloud-first platform also analyzes data stored in appliances and can function purely as software.

 

IBM Cloud

Location: Armonk, N.Y.

What it does: IBM’s full-stack cloud comes with 170 built-in tools, including more than 20 for customizable big data management. Users can opt for a NoSQL or SQL database, or store their data as JSON documents, among other database designs. The platform can also run in-memory analysis and integrate open-source tools like Apache Spark. 

 

MarkLogic

Location: San Carlos, Calif.

What it does: Users can import data into MarkLogic’s platform as is. Items ranging from images and videos to JSON and RDF files coexist peaceably in the flexible database, uploaded via a simple drag-and-drop process powered by Apache NiFi. Organized around MarkLogic’s Universal Index, files and metadata are easily queried. The database also integrates with a host of more intensive analytics apps.

 

Datameer

Location: San Francisco, Calif.

What it does: Though it’s possible to code within Datameer’s platform, it’s rarely necessary. Users can upload structured and unstructured data directly from more than 70 data sources by following a simple wizard. From there, the point-and-click data cleansing and built-in library of more than 270 functions, like chronological organization and custom binning, make it easy to drill into data even if users don't have a computer science background.

 

Wavefront

Location: Palo Alto, Calif.

What it does: Designed for time-series data pulled from the likes of CollectD, JMX and Amazon Web Services, this platform specializes in spotting trends — and, more important, deviations from them. The latter capacity means that when something suspicious happens, users can send and receive intelligent alerts, activated by multi-dimensional criteria rather than simplistic thresholds. 

 

Alibaba Cloud

Location: Hangzhou, Zhejiang, China

What it does: The leading public cloud provider in China, Alibaba operates in 19 regions worldwide, including the U.S. Its popular cloud platform offers a variety of database formats and big data tools, including data warehousing, analytics for streaming data and speedy Elasticsearch, which can scan petabytes of data scattered across hundreds of servers in real time.

 

Images via Shutterstock, social media and company websites.
