Tencent Music: From Elasticsearch to Apache Doris Content Library Upgrade, Unified Search and Analysis Engine, Cost Reduced by 80%
By: Zhang Jun, Luo Lei, Li Jipeng, Dai Kai, Tencent Music Content Information Platform Department
Introduction:
Tencent Music Entertainment has a vast content library encompassing recorded music, live performances, audio, and video. To give users a richer experience and to better support musicians and partners, Tencent Music has developed a comprehensive content library data platform. This platform centralizes data such as song libraries, artist information, album details, and label information, and supports services such as inventory analysis, user segmentation, metric analysis, and tag selection for business teams.
This article describes the evolution of the content library data platform, focusing on the transition from Elasticsearch to Apache Doris for the content search engine. The upgrade aimed to unify the content search and analysis engine, enable complex custom tag calculations, reduce costs, and improve performance.
Business Needs:
Tencent Music’s content library data platform serves two critical business needs:
- Content Library Encyclopedia Search: Analysts and operators require rapid access to information like singer names, song titles, and other textual data. This necessitates efficient full-text search capabilities along with support for diverse query conditions, ensuring swift data retrieval and enhanced productivity.
- Content Library Tag Selection: Analysts and operators need to filter content based on specific tags and conditions, requiring the system to handle billions of data points and deliver sub-second query responses for rapid data analysis and informed decision-making.
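As a rough illustration of what tag selection involves, the sketch below treats each tag as a set of content IDs and combines those sets to answer a query. The function name `select_by_tags`, the tag names, and the IDs are all hypothetical and are not taken from Tencent Music's platform; a production engine would apply the same logic to billions of rows using its own index structures.

```python
def select_by_tags(tag_to_ids, include, exclude=()):
    """Return the IDs that carry every tag in `include` and none in `exclude`."""
    include = list(include)
    if not include:
        # With no required tags there is nothing to select.
        return set()
    # Copy the first set so the caller's data is never mutated.
    selected = set(tag_to_ids.get(include[0], set()))
    for tag in include[1:]:
        selected &= tag_to_ids.get(tag, set())
    for tag in exclude:
        selected -= tag_to_ids.get(tag, set())
    return selected


# Hypothetical sample data: tag -> IDs of the songs that carry it.
tag_to_ids = {
    "genre:rock": {1, 2, 4},
    "language:en": {2, 3, 4},
    "live": {4},
}

rock_english = select_by_tags(tag_to_ids, ["genre:rock", "language:en"])
rock_english_studio = select_by_tags(
    tag_to_ids, ["genre:rock", "language:en"], exclude=["live"]
)
```

Here `rock_english` holds the songs tagged both rock and English, and `rock_english_studio` additionally drops anything tagged as a live recording.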
Elasticsearch + Doris Hybrid Architecture:
Prior to version 2.0, Tencent Music employed a hybrid architecture combining Elasticsearch and Doris. Elasticsearch was responsible for content search, while Doris handled data analysis. This setup, however, presented challenges:
- Redundant Data Storage: Both Elasticsearch and Doris stored the same data, leading to increased storage costs and potential data inconsistencies.
- Separate Search and Analysis Engines: The fragmented architecture hindered unified data management and analysis, requiring separate tools and processes for search and analysis.
- Limited Custom Tag Calculation Support: Elasticsearch’s capabilities were insufficient for complex custom tag calculations, limiting the platform’s analytical capabilities.
Apache Doris: A Unified Solution:
Recognizing these limitations, Tencent Music decided to migrate from Elasticsearch to Apache Doris for its content library data platform. Doris, with its robust capabilities in inverted indexing and full-text search, offered a unified solution for both content search and data analysis, addressing the shortcomings of the previous architecture.
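To make the role of an inverted index in full-text search concrete, here is a minimal Python sketch. It is a toy model of the general technique, not Doris's implementation: the whitespace tokenizer, the function names, and the sample song titles are all assumptions made for illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a real analyzer does much more."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each token to the set of document IDs whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search_all(index, query):
    """Return the IDs of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # Copy the first posting set, then narrow it token by token.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample titles, not from the Tencent Music library.
songs = {
    1: "Blue Moon Live",
    2: "Blue Sky",
    3: "Moon River",
}
index = build_inverted_index(songs)
matches = search_all(index, "blue moon")
```

Because the index maps tokens directly to document IDs, a query only touches the posting sets for its own tokens instead of scanning every title. That property is what lets an engine serve text search and analytical filtering from the same stored data.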
Key Benefits of the Upgrade:
The transition to Apache Doris yielded significant benefits:
- Unified Search and Analysis Engine: Doris consolidated the search and analysis functions, simplifying data management and enhancing analytical capabilities.
- Enhanced Custom Tag Calculation Support: Doris’s advanced indexing and search capabilities enabled complex custom tag calculations, empowering analysts with deeper insights.
- Improved Performance: Doris delivered a 4x improvement in write performance, enabling faster data ingestion and processing.
- Reduced Costs: The unified architecture eliminated redundant data storage, resulting in an impressive 80% reduction in storage costs.
Conclusion:
Tencent Music’s migration from Elasticsearch to Apache Doris for its content library data platform shows the value of a unified solution for content search and data analysis. The upgrade improved the platform’s functionality and performance while significantly reducing costs. This case study demonstrates the potential of Apache Doris as a robust and cost-effective option for organizations seeking to optimize their data management and analysis processes.