AsiaTechDaily – Asia's Leading Tech and Startup Media Platform
SingleStore, a real-time data platform, has announced a bi-directional integration with Apache Iceberg, an increasingly popular open-source table format for diverse datasets.
The integration aims to unlock the potential of data lakehouses by addressing a critical challenge: approximately 90% of data remains “frozen” and unusable for interactive applications, analytics, or AI. The new integration will allow enterprises to leverage this data to build intelligent applications and enhance both transactional and analytical operations.
In addition to the Apache Iceberg integration, SingleStore has unveiled improvements in vector search performance, enhancements to text search capabilities, new autoscaling features, and a new cloud offering that allows customers to deploy SingleStore on their private cloud.
These advancements collectively aim to simplify data management and help organizations use their data more effectively, supporting real-time operations through low-latency ingestion and bi-directional data flow.
Apache Iceberg has become the de facto standard for data lakes, offering an efficient way to store massive datasets and query large-scale data cost-effectively. However, enterprises face challenges in utilizing this data due to the complex and costly processes required to ‘thaw’ it, involving extensive ETL workflows and compute-intensive Spark jobs.
SingleStore’s new integration with Apache Iceberg will address these challenges by providing low-latency ingestion, bi-directional data flow, and real-time application performance at lower costs for modern intelligent applications and analytics.
This integration, now available in public preview, enables customers to create external tables in SingleStore based on Iceberg data, build projections on these tables, and use SingleStore’s speed and performance on previously inaccessible frozen data.
SingleStore’s customers, many of whom use Apache Iceberg for their data lakes and commercial lakehouses, have shown a strong interest in a solution that supports fast interactive applications and low-latency analytics using their vast, untapped Iceberg data.
In addition to the Iceberg integration, SingleStore has announced enhancements to its vector search capabilities, improving performance by about 40% with advanced algorithms. The release also includes improved full-text search features, such as relevance scoring, phonetic similarity, fuzzy matching, and keyword proximity-based ranking. These enhancements reduce the need for specialized databases, making it easier to build generative AI and real-time applications.
“Our vision has always been to provide one single data store for all companies to be able to take advantage of speed, scale and simplicity,” said Raj Verma, CEO of SingleStore. “Our data platform is designed to unlock all types of enterprise data — including data that is frozen in data lakes — to enable our customers to build modern intelligent apps. With this release, we believe we are enabling a significant portion of the market that today cannot build real-time modern applications on data stored in data lakes.”
Apart from the Iceberg integration, SingleStore has announced several new features and product enhancements designed to facilitate the creation of enterprise-grade intelligent applications. These include faster vector search capabilities, improved full-text search, autoscaling, and a fully managed private cloud offering known as Helios.
The enhanced vector search, which uses the HNSW algorithm, is now 40% faster than in the previous release. SingleStore’s IVF Flat index has been shown to be between 47 and 100 times faster than pgvector, and vector index build times are now two to three times faster than those of Milvus and pgvector. This feature supports a range of searches and filters, enabling enterprises to efficiently build and scale generative AI applications.
In the context of full-text search, SingleStore has introduced new capabilities, including improved relevance scoring, phonetic similarity, fuzzy matching, and keyword proximity-based ranking. These enhancements allow organizations to simplify their data architectures by eliminating the need for additional specialty databases, thereby facilitating the development of generative AI and real-time applications.