72 Big Data Companies to Know

To unpack big data is to unpack big solutions — check out the companies maneuvering data on a massive scale.

Written by Alyssa Schroer
UPDATED BY
Brennan Whitfield | Mar 11, 2024

The term big data is used in the business and tech world pretty frequently. In a nutshell, it refers to taking very large, complex datasets from multiple channels and analyzing them to find patterns and problems, all with the goal of gaining actionable insights. Big data is very valuable but also a lot for traditional software to handle, which is where professionals come in to unravel it all.

Top Big Data Companies To Know

  • Alteryx
  • FourKites
  • Google
  • IBM
  • Oracle
  • Salesforce  
  • SAP
  • Splunk

A number of companies have emerged to provide ways to wrangle huge datasets and understand the relevant information within them. Some offer powerful data analysis tools, while others aggregate and organize datasets into charts, graphs and other data visualization formats. 

The following companies all work with big data in some way, enabling other organizations to make better sense of their businesses and take new steps toward problem-solving.

 

72 Big Data Companies to Know

Location: Boston, Massachusetts

More than 350 brands representing industries ranging from financial services to media and publishing use BlueConic’s customer data platform to provide them with actionable insights based on consented first-party data. Its customizable solutions serve marketing, data science and other growth-focused teams from companies such as Heineken, Hearst, Mattel and Franklin Sports.

 

Location: Fully Remote

Arity works to bring greater safety and efficiency to the transportation industry by using data and analytics to build solutions with applications for auto insurance providers, marketers, public sector entities and mobile app developers. For example, its crash detection solution relies on proprietary algorithms to enable mobile apps to detect vehicle collisions so that emergency services can quickly be notified to respond and app users can be connected to other helpful resources.

 

Location: New York, New York

Caden serves as a two-sided data marketplace. Its mobile app allows consumers to securely control and monetize Caden’s access to their personal data. The company offers businesses a suite of products and solutions that gives them access to its intelligent Knowledge Graph, which is a compilation of consumer behavioral data gathered with the explicit consent of Caden users.

 

Location: Fully Remote

Arcadia is a healthtech company offering a cloud-based data platform that’s meant to drive better patient experiences and outcomes. Its analytics capabilities deliver insights at the right time to improve care quality in a way that can help to cut down on risks and costs. The company’s technology covers a variety of use cases, including care management, patient engagement, healthcare IT, health equity and patient retention.

 

Location: Fully Remote

Monte Carlo Data is a data observability company with an end-to-end platform that is built to prevent, detect and resolve data downtime. It features metrics that assess and describe the health and quality of client companies’ data assets, bringing attention to areas of concern and rating their severity and urgency so that engineers can efficiently resolve them.

 

Location: San Francisco, California

Samsara offers a suite of IoT solutions for commercial fleet management. The company says it has over a million devices deployed to vehicles around the world, and these devices send billions of data points to the Samsara Cloud each day. The company then processes trillions of video, image and time series data points each year to improve its offerings so customers have up-to-date, real-time information for fleet optimization and vehicle safety.

 

Location: New York, New York

Enigma provides corporate intelligence in the form of data on small- and medium-sized businesses. This information, which includes firmographic, identity and financial wellbeing data, can be hard to come by but crucial to know in business dealings. Enigma supplies data to its clients to serve them in risk monitoring, marketing and sales, onboarding and other internal operations. 

 

Location: Chicago, Illinois

Analytics8 is a business intelligence company that takes big data sets and develops strategy, management, analytics and governance solutions that make data usable for its clients. The company also offers modernization services that migrate data from legacy systems and initiate modern cloud-based collection and storage processes.

 

Location: Fully Remote

Sensor Tower is an app intelligence company, and one of the better-known names in big data, generating reliable insights into global digital market trends. Using both human data scientists and proprietary algorithms to process data from millions of sources, it offers user acquisition strategies, app store optimization, competitive analysis and app-specific data in the brand, gaming, finance, adtech and marketing spaces. The company also boasts a range of intelligence products.

 

Location: San Francisco, California  

The Splunk platform is designed to be a comprehensive data hub, able to manage data insights, data security and full-stack scalability. For navigating big data, Splunk allows the aggregation of hundreds of terabytes of data from databases, servers and more into one platform interface. Users can gain insights across real-time and historical data records, as well as integrate structured data from relational databases for extended data analytics.

 

Location: Boston, Massachusetts

Veeva supplies cloud-based software and data solutions for the healthcare industry. With the help of its development cloud program, Veeva can analyze the market and help expedite the delivery of new medicines to consumers. The company is a provider for businesses in the fields of regulation, research sites and more.

 

Location: Menlo Park, California

Founded in 2015, Ascend sought to establish a system that eliminated data maintenance and optimized engineering pipelines. The company succeeded with the creation of the Data Automation Cloud. The product allows businesses to harness the key capabilities of data engineering, like transformation and observability, ten times faster and in one platform.

 

Location: Fully Remote

Jebbit’s no-code software allows brands to easily create quizzes that increase customer engagement and capture data. Brands can use pre-made templates or customize their quiz interfaces with Jebbit’s tools. And beyond quizzes, Jebbit’s software can facilitate interactive editorial content, voting and live polling, trivia and lookbooks.

 

Location: Irvine, California

MFour Mobile is an all-in-one platform for market research. Whether it’s used for audience polling or on-demand surveys, the MFour Studio platform gathers information and pulls all the data into one unified interface. That way, consumer insights are more easily understood, which can improve decisions regarding advertisements, product development and more.

 

Location: Denver, Colorado

Lighthouse (formerly OTA Insight) is a cloud-based intelligence platform that works with hotel partners. The company aims to help partners understand real-time data that can be leveraged into actionable decisions regarding marketing, distribution and revenue. The data provided spans historical, current and forward-looking analytics.

 

Location: Fully Remote

Prefect is hoping to be the new standard for dataflow automation. The cloud-native platform delivers confidence to data engineers and data scientists through its collaboration hub, privacy assurance, configuration-free deployment and visibility features. Additionally, professionals can operate continuous workflows, event-based workflows, versioning and more.

 

Location: Fully Remote

Founded in 2020, Vendia aims to empower businesses and professionals by getting datasets unstuck from various silos. Vendia’s cloud-based platform delivers real-time data that can be shared with multiple parties through serverless ledgers, smart APIs, access control and more. The company’s platform is used by several industries including automotive, nonprofits and travel. Vendia’s ability to provide data quickly helps businesses redirect time and effort to generating a positive customer experience.

 

Location: Fully Remote 

The supply chain industry remains a complex field, so Craft is bringing much-needed clarity to customers. Teams can leverage Craft’s supplier intelligence platform to research companies, survey market landscapes, and monitor their supply chain operations. Because Craft provides a bird’s-eye view of all the details, companies can make more informed decisions and protect their finances. 

 

Location: Fully Remote 

Argyle is giving employees and trusted parties increased access to employee information to create faster transactions. Whenever employees need to verify their information or take care of financial issues, they can provide access through a digital platform simply by logging in to their employee account. Transferring money and checking info takes less time with the technology of Argyle.

 

Location: Fully Remote

Airbyte is a company that provides seamless data integration software. The platform gives customers the ability to manage their data integration tools under one system. Its foundation operates on an open-source model, meaning customers can create new data integration tools or customize pre-built tools that are already on the platform. Additionally, Airbyte offers data monitoring for accuracy and data security. 


 

Location: New York, New York

Applecart is helping companies target and influence consumers through interpersonal relationships. The company developed a Social Graph Platform that reviews public information worldwide to discover connections between people. From there, businesses can establish custom advertising groups and tailor their message to each group individually.

 

Location: Glendale, California 

DISQO works with brands, publishers, agencies and market researchers to provide audience insights acquired through open sharing and transparency. The platform uses API integrations and managed services to track brand insights and customer behavioral data across the web, going beyond surveys and diving into key deciding factors that determine shopping and search habits across channels and demographics.

 

Location: Chicago, Illinois

FourKites provides a real-time supply chain visibility platform that helps organizations transform entire supply chains by focusing on the details that matter most. From real-time transportation visibility to yard and appointment management, inbound freight visibility and dynamic oceanic visibility, FourKites improves inventory management and planning while improving yard efficiency and OTIF compliance.

 

Location: Boston, Massachusetts

Starburst Data allows organizations to access distributed data capabilities by providing them with a platform that offers a single point of access to all company data. The platform features an ultra-fast SQL-based MPP query engine built on Trino to provide teams with a query tool that is separate from the platform’s data warehousing system, perfect for data scientists, analysts, marketers and finance teams at companies of all sizes.

 

Location: Los Angeles, California 

Centerfield operates a customer acquisition and engagement platform that communicates with and discovers new insights about customers across both online and offline channels. The company relies on its data acquisition techniques to offer marketing and sales services that help brands of all kinds close deals more efficiently with their customers. Its discovery, messaging, call center, e-commerce and tech services help close sales and surface insights that produce better leads.

 

Location: Boston, Massachusetts

Klaviyo provides e-commerce companies with access to in-depth analytics that help them reach new audiences and significantly increase sales. The platform helps marketers unlock insights into building and scaling a brand, as well as taking a brick-and-mortar store online. It features segmented customer data and a vast number of integrations so that Klaviyo can boost businesses of all sizes in any industry.

 

Location: Boston, Massachusetts

ChaosSearch’s data analytics platform receives cloud data from businesses and establishes an index that is searchable, SQL-enabled and able to support machine learning workloads. The company strives to provide businesses with faster insights, unlimited scalability and lower costs.

 

Location: Fully Remote

DataGrail is a privacy platform that simplifies, automates and scales data privacy programs. With the integration of over a thousand apps, the company organizes data from across the apps into features like a live data map. Additionally, DataGrail monitors risk factors and eliminates any errors in manual processes.

 

Location: New York, New York

Howl’s e-commerce platform works to foster partnerships between creators and brands and to boost the voice of both parties. Using curated brand and customer data, the Howl platform offers sponsorship deals for online creators based on brands most relevant to the creator and the creator’s audience. Each creator’s dashboard is personalized, featuring real-time insights about the best-matching brands as well as the most popular posts and products with customers.

 

Location: Cambridge, Massachusetts

InterSystems is a leader in producing digital products that tie data together across the healthcare industry, as well as for life sciences, financial and governmental organizations. The company’s flagship product is its IRIS Data Platform, offering an intuitive method of building and deploying cloud-first applications with machine learning capabilities to close the gap between data and application silos and create better connectivity between providers, payers and patients.

 

Location: New York, New York 

CB Insights mines data across a vast range of touchpoints to predict technology trends and help companies grow across industries. The company’s software analyzes data in patents, venture capitalist financings, business relationships, news, market maps, social media and more to provide companies with granular insights into potential areas of growth and decline to be aware of, allowing decision-makers across organizations to feel more confident about their planning.

 

Location: Oakland, California 

Fivetran automates data integration, from source to destination, for a more efficient data analysis process. The platform comes loaded with features like ready-to-query schemas, SQL-based transformations, incremental batch updates and pre-built connectors that help to save data teams valuable time and resources. Optimizely, Square and DocuSign are just a few of the major companies that use Fivetran to automate and manage their big data integrations.

 

Location: Seattle, Washington

Qumulo is the creator of the first universal-scale file storage system called Qumulo File Fabric or QF2. The system stores as many files as needed, regardless of size, and moves data wherever it’s needed, making it accessible in the cloud or on-premise.


 

Location: New York, New York

With data breaches becoming more and more common, BigID provides companies with the tools to protect and manage personal data privacy. BigID solutions to keep companies GDPR-compliant include automating data mapping, identifying impacted users after breaches and more.

 

Location: San Francisco, California

Databricks is an analytics platform that unites data engineering and data science. The company’s solutions unify data and machine learning while also reducing the complexities of infrastructure. Major companies like HP, Live Nation, Edmunds, Samsung and Cisco are using Databricks.

 

Location: American Fork, Utah

Domo connects all employees to decision-driving data within their businesses. The platform provides real-time data with over 300 interactive dashboards and charts. Companies like DHL, The Honest Company and Mastercard use Domo to increase productivity and create data-driven cultures.

 

Location: Redwood City, California

Alation provides a collaborative data catalog for enterprises to collect and understand the most relevant information for their businesses. The company connects with major data platforms like Oracle, IBM, MySQL, Cloudera and Teradata to create one resource for all data-facing employees in a company. Its products are used by some of the largest finance firms in the world, as well as major brands like eBay, Square and Pfizer.

 

Location: Mountain View, California

For the last two decades Google has evolved and exponentially grown from the search engine we all know and use to the multinational company it is today. With dozens of products and services, Google keeps immense amounts of information and data organized and accessible to its users. Additionally, the search giant has used big data, artificial intelligence and machine learning to improve existing products and propel innovation.

 

Location: Armonk, New York

In addition to its myriad technology products and services, IBM supplies analytical solutions to help companies wrangle data effectively. The company’s focus for data is to ensure it is built on a solid foundation while making it simple and accessible with scaled insights. As one of the largest tech companies in the world, it’s no surprise IBM is creating big data solutions for its customers.

 

Location: San Francisco, California

Salesforce is one of the most well-known CRMs. The powerful tool lets companies log, manage and analyze customer data, information and activity. The platform is accessible from any location and helps sales and marketing teams integrate across apps and devices to ensure all crucial customer data is in one place.

 

Location: San Jose, California

Cohesity supplies a hyperconverged storage platform, consolidating secondary data and removing the need for secondary storage silos. Utilized by the federal government and enterprise customers, Cohesity easily stores all files, backups, test/dev and analytics in one place (hence the name).

 

Location: San Francisco, California

New Relic is an application performance monitoring solution delivered as a SaaS product. With full-stack visibility, the solution helps developers, IT teams and companies analyze large quantities of data and uncover insights in real time.

 

Location: San Diego, California

Teradata is a software and data company providing a variety of products to help with analytical challenges and queries. The platform provides analytics at scale, simplifying user experience. Teradata is used by major brands like Verizon, P&G, Columbia Sportswear, American Red Cross and Warner Brothers.

 

Location: Palo Alto, California

VMware provides cloud infrastructure and digital transformation technology. The company’s services span industries like education, finance, healthcare, manufacturing and retail to support both business and technological needs.

 

Location: Palo Alto, California

SAP’s data management suite operates across the cloud, enabling free movement between data systems and applications, while providing a unified view of data for enhanced monitoring and analytics.

 

Location: Redwood Shores, California

Oracle is a computer technology company providing a variety of database products and solutions. The company’s cloud platform helps enterprises make data useful through visualizations, machine learning models and predictive analytics.

 

Location: Santa Barbara, California

HG Insights offers B2B products to help growing businesses receive actionable insights from market data and scale targets accordingly. The HG Market Intelligence platform sizes and analyzes market data to find relevant trends and threats, providing customizable insights for product, sales and marketing initiatives.

 

Location: Irvine, California

Alteryx is an analytics platform offering end-to-end solutions for both data scientists and business analysts. The technology lets multiple teams work together to find solutions within their data. Alteryx partners with developers, analytics experts and the leading systems integrators to train and support its customers, achieving better business outcomes.

 

Location: Mountain View, California

BigPanda supplies an autonomous digital operations solution to companies like Macy’s, Riot Games, Shutterfly and Cisco. The platform enables IT Ops teams to correlate operational data and alerts, automate incident responses and streamline workflows.

 

Location: Redwood Shores, California

Reltio is a cloud data management platform for companies and organizations in industries ranging from finance and healthcare to life sciences and oil. The self-learning platform organizes all types of data at unlimited scale, unifying datasets and integrating analytics for business operations and processes.

 


 

Location: San Francisco, California

Crunchbase is a leading source of information and news about global companies, industry trends and investments. The online destination supplies a wide breadth of data and tools from startup funding news and large acquisitions to data integration products for enterprises.

 

Location: Chicago, Illinois

Founded by Dan Wagner, who served as the chief analytics officer on President Barack Obama’s 2012 re-election campaign, Civis Analytics creates cloud-based analytics solutions so companies can answer crucial questions and build their businesses. The company’s proprietary algorithm unifies data and ensures datasets are clean and prepared for modeling. Civis works with Fortune 500 companies as well as prominent nonprofits like the Bill & Melinda Gates Foundation, using data to solve a wide range of problems.

 

Location: Fully Remote

DRINKS bills itself as a wine-as-a-service platform for e-commerce retailers. The company provides an AI-based platform to generate a network of wine products on the retailers’ app or website. From there, retailers can optimize online marketing and merchandising, track consumer data and ship to 42 states in the U.S.

 

Location: Rockville, Maryland   

Micro Focus offers multiple enterprise software products, solutions and services for businesses looking to modernize core data and IT operations. The company’s analytics SaaS tools provide full-stack data visibility and storage, including for big data analytics. Using predictive AI, Micro Focus software can automate insights and risk detection for customer behavior, cognitive search, IoT and related datasets.

 

The Org hosts a database of 200,000 public organizational charts, providing detailed overviews of each company’s employees (including role, interests and past experience) and overall transparency on how the business is structured. Its platform showcases real-time data and insights into company workforces, letting prospective employees know exactly who they’ll be working with as well as matching sales professionals with curated customer leads.

 

Location: Cambridge, Massachusetts

Tamr provides data unification solutions to enterprises like GE, HP, GSK and Toyota. Utilized for both clinical and customer analytics, companies can spend less time preparing data and more time understanding customer behavior and utilizing existing clinical knowledge.

 

Location: Fully Remote 

SupportLogic is ensuring companies keep customers at the center of their business strategies. Combining machine learning and natural language processing, SupportLogic’s platform analyzes tickets to detect trends, keywords, and possible churn. Customer service teams can then take a more proactive approach, locating unhappy customers and handling their needs before escalations occur.

 

Location: Austin, Texas

RudderStack designs customer data infrastructure solutions for developers. The company specializes in enterprise-level data collection and sourcing while syncing the collection to every tool in your data stack. Developers can stream events on websites and applications. Additionally, the product features SDK identity resolutions, data governance and event transformations to harness a holistic view of customer data.

 

Location: New York, New York    

Developing custom software products for clients, Vention specializes in web, mobile, cloud and more services, including consulting for big data leverage. The company’s data engineering service teams are able to assist in robust data pipeline management, third-party system integration, data storage organization and business intelligence visualization. Vention’s analytics technologies are also powered by IoT and AI, making them effective for data insight across fast-changing industries like fintech, advertising, retail and healthcare.

 

Location: McKinney, Texas    

ScienceSoft delivers software development and IT consulting services, with designated big data expertise. Big data application testing, database architecture implementation as well as advisory and managed services are available from the company. In addition, ScienceSoft offers focused consulting and support assistance for big data frameworks like Apache Hadoop, Apache Spark and Apache Cassandra.

 

Location: New York, New York    

Oxagile creates customized software and service solutions for clients, one area of which is its big data analytics services. The company’s full-cycle delivery team includes specialization in big data product development, analysis and pipeline integration, processing and storing and more. Oxagile has utilized its tools to build big data solutions such as video advertising and monetization platforms, online video monitoring systems and yield optimization platforms across the media and adtech industries.

 

Location: Atlanta, Georgia     

RightData’s software solutions house a myriad of data management and testing tools, with big data migration testing offered by its RightData tool. The tool’s testing process involves cycles of data staging, MapReduce and output validations in order to ensure a steady migration of datasets to the cloud or other required storage. RDt’s testing engine also allows users to automate testing and create testing scenarios between disparate systems.

 

Location: Austin, Texas

Tableau makes solutions that help people understand their data through visualization, analysis and sharing across devices. With interactive and organized dashboards, Tableau enables easy collaboration and provides the analytical insights needed to answer questions and make decisions.

 

Location: Palo Alto, California

Cloudera supplies a cloud platform for analytics and machine learning built by people from leading companies like Google, Yahoo!, Facebook and Oracle. The technology gives companies a comprehensive view of their data in one place, providing clearer insights and better protection.

 

Location: San Francisco, California 

Sift utilizes the power of machine learning to proactively prevent fraud that impacts both consumers and companies, helping organizations grow revenue and establish trust with customers across a range of industries. The platform’s Digital Trust & Safety Suite includes tools for protecting payments, keeping spam away from content and defending accounts from takeovers, with a library of resources online to provide users with data on the state of widespread fraud and prevention.

 

Location: San Francisco, California 

UserTesting has been utilized by a number of leading enterprises, such as Facebook, Walmart and Capital One, to capture real-time feedback from customers and close gaps between companies and their audiences. The platform features targeting, engagement, discovery and data sharing tools, like self-guided videos of customers interacting with websites and review metrics, to develop a clearer understanding of customer experiences and confirm that users are getting the experiences that companies set out to provide.

 

Location: San Francisco, California 

Segment is a customer data platform that collects information from user events on hundreds of web and mobile apps and centralizes it in one location to provide marketing, product and engineering teams with better insights. Because the big data analytics platform gathers individual customer data across multiple channels, marketing and data teams can get granular views into consumer habits. IBM, Docker, Atlassian and Instacart are just a few of the major brands that use Segment to capture and manage big data.

 

Location: San Francisco, California

Datameer is a hybrid cloud platform where business analysts and data scientists can create pipelines from any data source in any location. Datameer’s products are used by companies and organizations in the health, finance and telecommunication industries to deliver strategy-driving data.

 

Location: Redwood City, California

Informatica’s intelligent data platform collects data from any source, no matter how fragmented, and transforms it into a safe and accessible dataset. Its modular platform gives companies the flexibility to scale, adding management products as data grows.

 

Location: Northbrook, Illinois

Mu Sigma is a leading provider of analytical solutions around the world. Its software lets companies scale problem solving and decision making through a virtual chain of mapping, analytics execution and operationalization.

 

Location: Santa Clara, California

Qubole is a self-managing, self-optimizing cloud-based data activation platform. Constantly learning about a company’s data, the platform is made for any employee who interacts with data regularly, enabling them to increase collaboration and focus on business outcomes.

 

Location: Palo Alto, California

Striim is a streaming integration solution for real-time data movement into data warehouses, databases and other analytical systems. Companies can create data pipelines for cloud and data integration and detect security threats, fraud and other risks in real time.

Da’Zhane Johnson, Margo Steines, Rose Velazquez and Sara B.T. Thiel contributed reporting to this story.
