科技研发

R&D

科技研发

R&D

数据科学

Data Science

随着大数据正在成为推动经济转型发展的新动力,数据科学领域也随之将对未来信息产业格局产生重要影响。数据科学广义上是通过对数据进行处理,从而从数据中获取有价值的信息,进而形成知识的过程。数据科学领域涉及的学科有很多,包括统计学、计算机科学、工程学等。

 

As big data is becoming the new power of promoting the economic development, data science will also impact the information industry structure in the future. In a broad way, data science is the process that obtaining the valuable information through processing the data, and then turning it into the knowledge. The data science field involves many disciplines, including statistics, computer science, engineering and so on.

 

为响应国家发展大数据战略的重要举措,集团公司依托中央研究院核电软件技术中心成立了集团公司的大数据中心,通过实现数据融合和基于海量数据的价值挖掘,全面应用于优化设计、科学选型、规范生产、优化管理、智能分析等方面,为公司竞争力提供有力的支持。
 

To respond the important measure of developing the big data strategy, the company decides to let SNPSDC set up the big data center, which provides strong support for the company competitiveness by realizing data integration and big data mining, and applies to optimal designing, scientific selection, normative production, optimal management, intelligent analysis and so on. 

大数据平台


基于Apache Hadoop研发一站式大数据综合平台,通过提供数据存储、分布式计算、数据分析挖掘、数据可视化的整套支持,建立一个统一的数据和计算平台,在此一站式大数据平台上可以进行数据采集、存储、查询、分析、搜索、挖掘和可视化等工作。

 

Develop the big data platform based on Apache Hadoop. Establish a unified computing platform by providing data storage, distributed computing, data mining and data visualization. You can collect, store, query, analyze, search, mine, and visualized data on this big data platform.

机器学习
 

通过对样本数据集进行训练,不断优化参数,建立针对某个问题的模型。机器学习算法库提供包括分类、聚类、预测、推荐等并行化的高性能机器学习算法,可用于构建高精度的推荐引擎或者预测引擎。

 

Build model for an issue by training the sample data and optimizing parameters constantly. Machine learning algorithms library offers parallelized high-performance algorithms, including classification, clustering, prediction, recommendation and so on, which can be used to build high-precision recommendation engine or prediction engine.

 
数据挖掘与分析

 

R语言统计计算库提供强大的主流数据统计和绘图语言R以及Web图形化开发界面。通过建立和调用内置并行算法库,支持对大数据集进行数据挖掘和分析。

 

R language statistical computing library provides powerful mainstream data statistics, graphical language R and Web graphical development interface. Through establishing and calling the built-in parallelized algorithms library, supporting the mining and analysis for big data sets.

数据可视化


采用动态图或多图形式展现不同数据源间的差异,结合多源异构数据融合、变换及摘要方法,采用单一大图展示数据总体结构及特征,有利于用户宏观可视分析大数据。

Show the difference between different data sources by using dynamic graph or multi-graph form. Combining multi-source data integration and transformation and abstract method, and displaying the data structure and characters by using a single large graph, can help users visualize and analyze the big data broadly.

 
数据服务
 

大数据中心可以在集团乃至整个行业内实现数据共享。首先,大数据平台可通过企业大数据门户提供给全集团领导和员工使用,支持公司在管理、经营、科研、生产等方面的工作。其次,大数据中心可以为行业内各单位提供产品、数据支持和技术服务,一些大数据应用的成功案例可以通过企业大数据门户向全集团进行宣传,以促进大数据平台的推广使用。

 

The big data center can achieve data sharing in the whole company and even industry. First, leaders and staff of the company can use the big data platform through enterprise big data web portal to support the company in management, scientific research, production and so on. Second, the big data center can provide products, data support and technical services for the units in the industry. Some successful cases in big data applications can be promoted through enterprise big data web portal to the whole company to promote the use of the big data platform.