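Deep learning can be seen as a rebranding of neural networks, since it inherits many key algorithmic techniques from neural network research. It has regained attention because of recent breakthroughs in image recognition and speech recognition [1, 2]. Two key factors account for this success: a substantial increase in computing power and a large-scale increase in training data. Most current open-source deep learning tools and platforms run on a single GPU node, which limits both the size of the model and the size of the training dataset.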
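Distributed training can support large-scale deep learning, and it has attracted attention from both academia and industry. Drawing on our experience in developing distributed database systems, we developed SINGA ("lion"), an Apache open-source distributed deep learning platform. SINGA has three characteristics: usability, scalability, and extensibility [5, 6, 7, 9]. SINGA's model makes it easy for others to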
Deep learning has been one of the most popular topics in computer science in recent years. It has boosted many complex data-driven applications such as image classification and speech recognition. The database community has worked on data-driven applications for many years. However, databases and deep learning differ in both techniques and applications. In , we discuss research
We are building a new distributed Universal data storage system, called UStore, which
As in other application domains, doctors and medical specialists are now interested in exploiting healthcare data for better healthcare and disease prevention, and for better utilization of resources. However, although the data is deposited in central, possibly national-scale, storage systems, it is owned by different healthcare providers who will guard it zealously, due to its commercial value
Figure 1. Landscape of Modern Database Systems
The continuous increase in volume, variety and velocity of Big Data exposes datacenter resource scaling to an energy utilization problem. Traditionally, datacenters employ x86-64 (big) servers with power usage of tens to hundreds of Watts. But lately, low-power (small) systems originally developed for mobile devices have seen significant improvements in performance. These improvements could lead
The widely adopted single-threaded OLTP model assigns a single thread to each static partition of the database for processing transactions in a partition. This simplifies concurrency control while retaining parallelism. However, it suffers performance loss arising from skewed workloads as well as transactions that span multiple partitions. In this paper, we present a dynamic single-threaded in-memory
In-memory databases have gained a lot of traction in recent years due to a sustained drop in memory cost and increases in memory size and speed. Indeed, as was said many years ago: "memory is the new disk, disk is the new tape". Figure 1 shows the development of database systems in recent years.
R-Store is a scalable distributed system that supports real-time OLAP by extending the MapReduce framework. We extend an open-source distributed key/value system, HBase, as the underlying storage system that stores the data cube and real-time data. When real-time data are updated,
Locating the K nearest data points with respect to a given query point in high-dimensional space is a basic operation in many applications. Tree-structure based approaches do not scale with dimensionality due to the curse of dimensionality.
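To make the basic operation concrete, here is a minimal sketch of an exact K-nearest-neighbor query by linear scan. It is only a baseline for illustration, not the approach proposed in this work; the function name `knn_linear_scan` and the choice of Euclidean distance are assumptions made for this example.

```python
import heapq
import math


def knn_linear_scan(points, query, k):
    """Return the k points closest to `query` (hypothetical helper).

    Assumes Euclidean distance and that every point has the same
    dimensionality as the query. Every point is examined once, so the
    cost grows linearly with the number of points and dimensions.
    """
    if k <= 0:
        return []
    # nsmallest keeps only k candidates while scanning and returns them
    # ordered from nearest to farthest.
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))


if __name__ == "__main__":
    data = [(0.0, 0.0, 0.0, 0.0), (1.0, 1.0, 1.0, 1.0),
            (5.0, 5.0, 5.0, 5.0), (0.5, 0.0, 0.0, 0.0)]
    print(knn_linear_scan(data, (0.0, 0.0, 0.0, 0.0), 2))
```

A linear scan needs no index and is unaffected by how well a tree partitions the space, which is why it is a common reference point when index structures degrade in high dimensions.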
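With the rapid growth of data volumes and the increasingly urgent need for complex analytics, distributed systems must keep adding nodes and scaling out to handle huge workloads. Clearly, a larger number of nodes inevitably leads to more frequent node failures. An effective failure recovery strategy is therefore very important for distributed systems. Existing distributed systems use two common failure recovery strategies: checkpoint-based recovery and confined recovery. Recently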